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This  report  presents  the  results  of  a study  and  investigation  of  software 
reliability  models.  In  particular,  the  purpose  was  to  investigate  the  statis- 
tical properties  of  selected  software  reliability  models,  including  the  statis- 
tical properties  of  the  parameter  estimates,  and  to  investigate  the  goodness  of 
fit  of  the  models  to  actual  software  error  data.  The  results  indicate  that  the 
models  fit  poorly,  generally  due  to  in  most  part  the  vagarlef  of  the  data 
rather  than  shortcomings  of  the  models. 
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EVALUATION 


The  increasing  requirements  for  highly  reliable  software  systems  in  such 
applications  as  command  and  control  and  avionics  has  led  to  the  need  for 
better  specification  of  reliability  requirements  in  software  developments  and 
for  techniques  for  assessing  the  reliability  of  a delivered  software  system. 
This  need  has  been  expressed  in  numerous  industry  and  government  sponsored 
conferences,  as  well  as  documents  such  as  the  Joint  Commanders'  Software 
Reliability  Working  Group  Report  (November  1975).  As  a result,  numerous 
efforts  have  been  initiated  to  develop  mathematical  techniques  for  predicting 
the  reliability  of  software  systems.  However,  the  application  of  these 
techniques  has  been  delayed  because  they  have  not  been  subjected  to  sufficient 
validation  efforts  to  check  both  model  mathematics  and  model  predictions. 

This  effort  was  initiated  in  response  to  this  need  for  model  validation 
and  fits  into  the  goals  of  RADC  TPO  No.  5,  Software  Cost  Reduction  in  the 
subthrust  of  Software  Quality  (Software  Modeling).  This  report  summarizes 
the  validation  of  several  different  mathematical  models  for  predicting  the 
reliability  and  error  content  of  a software  system  against  software  error  data 
extracted  from  actual  Department  of  Defense  software  developments.  This 
validation  effort  has  investigated  these  models  in  terms  of  not  only  the 
validity  of  model  assumptions  and  the  properties  of  model  parameter  estimators, 
but  also  in  terms  of  model  predictability  based  upon  statistical  goodness-of- 
fit  tests.  The  importance  of  this  validation  effort  is  that  it  provides  the 
necessary  statistical  properties  and  model  experimentation  to  permit  the  use 
of  the  investigated  models  on  actual  ongoing  software  developments. 

The  results  of  this  model  validation  effort  will  lead  to  the  use  of 
these  software  reliability  prediction  models  by  Air  Force  software  managers 
to  accurately  assess  the  status  of  their  developments  early  enough  to  correct 
potential  problems  without  increasing  significantly  the  lift  cycle  costs. 

The  results  of  this  validation  effort  will  also  permit  the  specification  of 
software  reliability  requirements  on  future  software  RTFs  by  insuring  that  the 
techniques  do  exist  to  verify  that  the  delivered  software  meets  these  require- 
ments. Finally,  the  models  validated  under  this  effort  will  be  applicable  to 
current  software  development  projects,  thus  helping  to  increase  the  reliability 
and  lower  the  life  cycle  cost  of  todays  software  systems. 

OJUv*  n 

.ALAN  N.  SUKERT 
Project  Engineer 
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SUMMARY 


'Hie  purpose  of  this  study  was  the  validation  of  existing  software  (S/W)  reliability 
models.  This  validation  was  accomplished  by  investigating  the  properties  of  model 
parameter  estimates,  by  investigating  the  validity  of  the  model  internal  assumptions, 
and  by  analyzing  the  goixlncss-of-fit  of  the  models.  These  investigations  were  all 
made  in  terms  of  actual  S/W  error  data  from  numerous  (16)  electronic  system  com- 
puter programs  which  represented  a wide  variety  of  system  types. 

The  types  of  S AY  reliability  nnxlcls  studied  were  basically  two;  Poisson  and  Bi- 
nomial models,  ihe  methods  of  parameter  estimation  investigated  were  also  two:  the 
maximum  likelihood  method  and  the  least  squares  method  . 

The  following  results  were  obtained. 


Methods  of  Estimation 

• Both  methods  of  estimation  require  numerical  methods  of  solution  for  parameter 
estimation.  Ordinarily,  in  this  age  of  the  digital  computer,  this  would  be  no  prob- 
lem. However,  definitive  methods  of  determining  the  required  starting  points  (for 
the  numerical  method)  were  not  available  and  convergence  problems  were  observed: 
either  no  convergence  was  obtained  or  convergence  to  absurd  (e.g.  negative  values 
for  parameters  known  to  be  positive)  values. 

• It  was  established  that,  with  respect  to  precision,  the  MLEs  were  superior  to  the 
LSEs. 

c The  MLE  method  seemed  to  experience  slightly  more  convergence  problems  than 
the  LSE  method. 


Models 

• The  best  fitting  model  was  the  most  general  (i.e. , 3 parameters)  of  the  Poisson 
models.  A major  advantage  of  these  Poisson  models  is  that  they  permit  distinction 
between  errors  observed  and  errors  corrected. 


• It  was  clear  that  the  fits  were  poor  for  all  models.  That  is,  poor  fit  was  the 
uniformly  common  result. 

• The  poor  fits  were  due  to  several  causes: 

i)  all  of  the  model  internal  assumptions  are  not  alid.  For  example,  none  of  the 
models  accommodate  the  introduction  of  errors  in  the  debugging  process. 

ii)  the  S/W  debugging  process  is  likely  much  more  complex  than  the  models 
assume. 

iii)  the  models  seemed  over-sensitive  to  the  failure  of  data  to  "follow"  the  inter- 
nal model  assumptions. 


Data  Base 

• Generally,  the  data  sets  failed  to  show  a decreasing  error  rate  as  (calendar)  time 
increases:  even  apart  from  random  flucuations.  This  property  of  the  data  played 
havoc  with  the  model  fits.  We  suspect  the  data  displayed  this  property  because 

i)  errors  were  introduced  during  the  process  of  correcting  discovered  errors. 

ii)  the  man-power  effort  was  not  constant  throughout  the  development  process, 

iii)  all  of  the  S/W  was  not  subject  to  test  all  of  the  time. 

In  short,  while  a great  deal  of  useful  work  has  been  accomplished  in  S/W  reliabil- 
ity modeling,  more  work  remains:  mostly,  we  feel,  in  the  area  of  S/W  error  data 
collection. 


Section  1.0 


INTRODUCTION 


The  objective  of  this  effort  was  to  validate  selected  existing  S/W  reliability 
models.  In  particular  the  specific  objectives  were 

i)  With  regard  to  estimating  the  parameters  in  the  models 

• identify  the  MLEs  and  LSEs 

• identify  conditions  under  which  the  estimates  do  not  exist 

* 

• investigate  the  statistical  (small  sample  and  large  sample) 
properties  of  the  estimates 

ii)  Validate  internal  assumptions  of  the  model  in  terms  of  S/W  develop- 
ment processes 

iii)  Test  the  goodness-of-fit  of  the  models  on  actual  S/W  error  data 

iv)  Develop  methods  of  predicting  the  time  necessary  to  remove  a specified 
percent  of  remaining  errors  for  each  model 

v)  Compare  efficiency  of  the  various  models. 

We  were  successful  in  accomplishing  all  of  these  objectives  except  we  were  not 
able  to  work  out  the  small  sample  exact  estimator  probability  distributions  nor 
were  we  able  to  obtain,  except  by  simulation,  other  small  sample  properties; 
the  problem  being  one  of  intractability  of  the  analytical  problems  ahd  the 
inability  to  find  appropriate  pivotal  functions  (functions  of  the  parameters  and 
the  estimator(s)  distributed  free  of  the  other  parameters). 

S/W  reliability  modeling  is  a tough,  very  tough,  business  in  the  following 
sense.  First,  no  model,  however  attractive,  will  ever  be  accepted  without 
verification  with  actual  S/W  error  data;  and  rightfully  so.  Now  the  physical, 
stochastic  process  which  generates  the  S/W  error  data  is  extremely  complex 
and  it  is  difficult  to  invent  models  with  enough  parameters  to  reflect  the 
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process.  The  problem  even  goes  deeper  as  illustrated  by  the  following  idea.  It  is 
entirely  plausible  that  as  SAV  is  debugged  (and  hence  errors  are  discovered  and 
corrected)  errors  are  introduced.  Now  it  is  not  impossible  to  modify  the  models 
(except  for  the  IBM  Model  in  a limited  sense,  none  of  the  models  we  investigated 
accommodate  the  introduction  of  errors)  to  include  a mechanism  to  allow  lor  intro- 
ducing errors.  A most  obvious,  and  simple,  way  is  to  include  a probability,  say  pj, 
that  an  error  is  introduced  at  each  removal.  But  is  such  a parameter,  namely  pj, 
estimable?  That  is,  can  it  be  estimated  at  all?  It  is  unlikely  that  it  can  no  matter 
how  carefully  the  S/W  data  is  "taken",  for  in  order  to  estimate  pj  one  would  have  to 
have,  in  some  form  or  other,  the  number  of  errors  introduced  at  each  removal  and/or 
discovery.  It  is  doubtful 

i)  that  anyone  knows  that  an  error  has  been  introduced  or  he  wouldn't  have 
introduced  it 

ii)  that  the  error  data  as  it  occurs  could  be  separated  into  those  initially 
present  and  those  introduced  in  correction/removal. 

More  to  the  point  is  the  remark  that  there  are  more  serious  difficulties  in  validating 
S/W  reliability  models  than  not  counting  induced  errors.  The  history  of  mathematical 
modeling  is  replete  with  examples  of  situations  where  the  model  "oversimplified"  the 
process  and  yet  was  astonishingly  successful  in  modeling  the  process.  The  problem 
with  S/W  reliability  modeling  is  the  data  base;  in  particular  the  collection  of  data 
concerning  concomitant  variables.  That  is,  there  arc  variables  which  have  measur- 
able and  important  effects  on  the  number  of  errors  observed  over  any  particular  time. 
Such  a variable  is  the  number  of  man-hours  per  calendar  hour  devoted  to  the  debugging 
process.  This  number  is  certainly  rarely  constant  throughout  the  debugging  process. 
There  are  other  such  variables;  moreover,  it  is  difficult  to  incorporate  all  of  them 
in  the  various  models;  the  models  would  contain  too  many  parameters  and  the 
mathematics  of  the  estimation  process  would  become  intractable.  Thus  the  values 
assumed  by  these  concomitant  (not  included  in  the  model)  variables  must  be  recorded 
so  that  the  data  may  be  adjusted. 

In  any  event  the  objectives  previously  mentioned  were  approached  in  terms  of 
i)  three  (.1)  Poisson  type  S/W  reliability  models:  Jelinski-Moranda  (J-M),  Shick- 
Wolverton  (S-W)  and  a generalized  Poisson  model  (GPM),  ii)  a binomial  model  and 

iii)  an  IBM  non-homogeneous  Poisson  process  type  model  which  also  arises  in  the 
context  of  (hardware)  reliability  growth. 

The  required  S/W  error  data  base  was  sixteen  (1(>)  data  sets  representing  a 
variety  of  types  of  systems  (i.e. , of  hardware  systems). 

The  balance  of  this  report  describes  the  details  and  analyses  involved  in 
achieving  the  report  objectives. 
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2.  1 DATA  BASE:  GENERAL  DESCRIPTION 

The  data  base  lor  this  S/W  reliability  study  was  constructed  from  a total  of  12 
sources;  three  internal  to  Hughes,  and  nine  from  external  sources.  These  sources 
represented  a wide  range  of  S/\V  applications  including  command  and  control,  sonar, 
radar  (ground  fixed,  ground  mobile,  shipboard),  guidance  and  navigation,  communi- 
cations, and  general  information  processing  systems.  A summary  of  this  data  is 
given  in  Table  2. 1. 1 along  with  the  time  period  over  which  each  set  of  data  was 
collected,  number  of  errors  observed,  language  used,  and  approximate  number  of 
instructions. 


2.2  DESCRIPTION  OF  DATA  SETS  AND  PROCESSING  RAW  DATA 

With  the  exception  of  the  reference  (II,  2)  data,  all  data  for  the  S/W  study  was 
extracted  'rom  raw  data  which  included  such  information  as  SPR  or  PTR  number, 
routine  suspected  of  having  the  error,  date  of  error  occurrence,  an  error  category 
code,  date  of  error  fix,  and  various  comments  concerning  the  nature  of  the  problem 
and  how  the  fix  w'as  made.  The  data  needed  for  the  s/W  models  was  the  number  of 
errors  observed  and  removed  in  successive  debugging-time  intervals  (the  cut-off 
times  for  these  time  intervals  were  determined  bv  dates  when  clustering  of  observa- 
tions occurred).  The  number  of  errors  and  fixes  and  the  length  of  any  given  debugging- 
time interval  could  not  be  extracted  directly,  however.  The  SPR's  or  PTR's 
corresponding  to  certain  error  category  codes  were  deleted  from  the  raw  data  because 
the  error  codes  did  not  correspond  to  S/W  failures.  The  types  of  "errors"  described 
by  these  inadmissible  codes  include  user  requested  changes,  duplicate  reports, 
documentation  errors,  hardware  related  reports,  and  questions  (i.e. , the  SPR  or 
PTR  was  used  simply  to  make  a statement  or  ask  a question).  Having  removed  these 
non-S/W  failures  it  was  then  necessary  in  computing  the  debugging-time  interval 
lengths  to  remove  Saturdays,  Sundays,  and  holidays  since,  in  general,  the  debugging 
ceased  on  these  days.  If  an  SPR  or  PTR  was  opened  or  a fix  made  on  a hoHday  or 
weekend,  then  that  holiday  or  that  entire  weekend  was  counted  in  the  debugging-time 
interval  length. 


*PWS  macros  not  measurable  in  comparable  units. 


TABLE  2.  1. 1.  DESCRIPTION  OF  DATA  SETS  (CONT) 
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♦♦Unknown. 

T 15A,  15B,  15C  represent  distinct  computer  programs  making  up  15.  The  sum  of  the  errors  found  in  15A 
15B,  15C  do  not  add  to  447  because  they  were  adjusted  for  varying  manpower  assignments  and  percent  of 
total  software  being  tested. 


2.3  FURTHER  PROCESSING  OF  DATA 

A common  assumption  in  S/W  reliability  models  is  that  all  errors  discovered  in 
a given  debugging-time  interval  are  removed  at  the  end  of  that  interval.  All  raw 
data  used  in  this  study  indicated  that  it  was  impossible  to  choose  the  debugging-time 
intervals  in  this  way  because  of  the  way  find-dates  and  fix-dates  were  clustered. 

In  view  of  this  difficulty,  three  types  of  data  sets  were  constructed  from  the  raw  data 
(see  Figure  2.2.1).  These  consisted  of  the  number  of  "finds"  for  successive 
debugging-time  intervals,  the  number  of  "fixes"  for  successive  debugging  intervals, 
and  a combination  of  these  whereby  a column  of  data  with  the  cumulative  number  of 
fixes  made  as  a function  of  debugging-time  was  combined  with  the  "find"  data 
(examples  of  these  are  shown  in  Table  2.2.1).  The  models  can  be  fit  to  the  "find" 
data  by  assuming  that  the  errors  are  actually  removed  at  the  end  of  each  debugging- 
time interval  and  that  the  "fix"  dates  are  inaccurate.  Alternately,  the  models  can  be 
fit  to  the  "fix"  data  assuming  that  the  number  of  finds  in  a particular  debugging-time 
interval  between  successive  clusters  of  fixes  is  equal  to  the  number  of  fixes  in  that 
time  interval.  Finally,  when  possible,  the  models  can  utilize  the  combined  find/fix 
data  which  result  in  a different  number  of  "fixes"  at  the  end  of  a debugging -time  inter- 
val than  "finds"  within  that  interval.  (These  three  data  types  are  easily  utilized  by 
most  of  the  models  studied  here  by  defining  the  cumulative  number  of  errors  removed 
by  the  end  of  each  debugging -time  interval  as  a known  parameter  instead  of  as  the 
cumulative  sums  of  the  sequence  of  observations  of  errors  which  arc  random 
quantities.  See  Section  3.0  for  further  discussion.) 

As  a final  data  preprocessing  effort,  each  "find,"  "fix,"  and  combined  "find"  and 
"fix"  data  set  was  converted  so  that  within  each  debugging-time  interval,  at  least 
10  errors  were  observed.  This  was  done  bv  combining  successive  time  intervals 
until  the  total  number  of  errors  observed  exceeded  or  equaled  10.  This  effort  was 
initiated  to  insure  the  validity  of  the  limiting  distribution  of  the  statistic  used  for 
testing  goodness -of-fit  (see  Section  5.2), 

Table  2.2.2  shows  the  numbering  conventions  used  for  each  data  set. 


TABLE  2.2. 1.  EXAMPLES  OF  DATA  SETS 


"Find"  (or  "Fix")  Data  Set 


Time  Interval 

Cumulative  Time 

Cumulative  "Finds" 

"Finds"  (or  "Fixes") 

Length 

(Days) 

(or  "Fixes") 

in  Time  Interval 

(Days) 

34 

5 

5 

34 

35 

15 

10 

1 

39 

16 

1 

4 

43 

• 

20 

• 

4 

. 

4 

• 

• 

• 

• 

o • 



. 

• 

Combined  "Find"  and  "Fix"  Data  Set 


Time  Interval 

Cumulative  Time 

Cumulative 

"Finds"  in 

Length 

Cumulative 

(Days) 

"Finds" 

Time  Interval 

(Days) 

"Fixes" 

2 

1 

1 

2 

1 

3 

3 

2 

1 

1 

15 

4 

1 

12 

2 

29 

5 

1 

1 

14 

3 

1 
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TABLE  2.2.2.  DATA  SET  NUMBERING  CONVENTION 


15A- 


LData 

Source 

Designator 

(See 

Table  2. 1. 1) 


Data  Type  Designator: 

ff  — Combined  "find"  and  "fix" 

ffG  — Combined  "find"  and  "fix"  grouped  for 
?10  observations  per  time  interval 

fd  - "Find" 

fdG  - "Find"  grouped  for  > 10  observations 
per  time  interval 

fx  - "Fix" 

fxG  - "Fix"  grouped  for  >10  observations  per 
time  interval 


i 


i 

I 

F 

l 

I 
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Section  3. 0 


DESCRIPTION  OF  MODELS 


3.1  GENERALIZED  POISSON  MODEL 

The  generalized  Poisson  model  (GPM)  is  a generalization  of  the  S-VV  and  J-M 
models  discussed  in  references  (III,  9)  , (111,10)  , (111,20)  , and  (111,24)  . The  GPM 
assumes  that  the  number  of  errors  observed  in  the  ith  debugging -time  interval 
Tj  has  a Poison  dis  tri bution  with  mean  value 

E(N.)  d>  (N-M._1) 

where  <t>  is  a constant  of  proportionality,  N is  the  total  number  of  errors  present  in 
the  S/W  initially,  and  Mj  is  the  total  number  of  errors  removed  up  to  the  end  of  the 
jth  debugging-time  interval.  It  is  assumed  that  when  errors  are  .removed,  they  are 
removed  at  the  ends  of  the  debugging-time  intervals.  Some  authors  have  assumed 
that 


J 

M \ N; 

.1  — J 

i=  1 

i.e. , that  :dl  errors  found  in  each  debugging-time  interval  are  removed  at  the  end  of 
that  interval.  This  is  an  unnecessary  assumption  and  practically  is  never  satisfied. 

We  may  treat  Mq  5 0,  Mj,  M2, as  known  constants  and  assume  that 

successive  observations  Nl,  N’2v  •■>«,  are  independent.  The  joint  probability 

function  of  N , . . . , N.  is 
1 Ik 

1 


• • • 


It  should  be  pointed  out  that  if  tj.  = t2  = ...  = t^,  then  the  parameter  a (o  describes 
the  degree  of  dependence  of  the  expected  number  of  errors  on  the  length  of  the  time 
interval)  is  not  identifiable,  i.e. , t may  be  absorbed  into  the  constant  4>  and  the 
model  reduces  to  a two  (N,  <t>)  parameter  model.  Because  of  the  assumptions 


Var  (Nj)  = <(>  (N-M._1)  t ^ 

Cov  (N.,  Nj)  = 0,  i / j. 

For  the  case  when  o = 1,  this  model  becomes  the  J-M  model  and  when  a = 2,  it 
becomes  the  S-W  model.  In  general,  o should  be  positive  since  the  number  of  errors, 
on  the  average,  observed  in  a time  interval  should  increase  as  the  time  interval 
length  increases,  and  should  decrease  to  0 as  the  time  interval  length  shrinks  to  0. 


3.2  IBM  MODEL 

The  IBM  model  is  based  on  the  non-homogeneous  Poisson  process  (see (II, 2)  . The 
basic  assumptions  concerning  the  non-homogeneous  Poisson  process,  say  N(t),  are: 

1.  N(t),  t 2 0,  is  an  integer  valued  random  variable  with  N(0)  = 0 almost 
surely. 

2.  For  0 < tj  <t2  < ...  < tfc,  the  random  variables  N(ti),  N(t2)-N(tj), 

N(t4)-N(t3),  N(tfc)  - N(tk_i)  are  stochastically  independent. 

3.  P f N(t+h)  - N(t)  > 2}  = o(h)*  as  h — 0 + 

4.  P i N(t+h)  - N(t)  = 1)  = \(t)h  + o(h)  as  h — 0 + where  \(t)  is  integrable 
over  [0,  T]  for  all  T,  \(t)  > 0 for  all  t 2 0. 

From  these  assumptions  the  probability  distribution  of  N(t)  -N(tft)  can  be  derived  for 
any  t > > 0 . 

Let  Pn(t)  = P f N(t)  - N(tQ)  = n} , n = 0,  1,  ....  Then,  $ 

PQ  (t+h)  = PQ(t)  (1  - \(t)h)  + o(h)  as  h - 0 +. 

Rearranging  and  dividing  by  h > 0 and  letting  h -*  0, 

dPn(t) 

— v — = -\(t)P  (t)  with  the  initial  condition  Pn(tn)  = 1.  Solving  this  linear 
dt  u 00 

differential  equation  yields 

PQ(t)  = exp  (-J t)  dt  + c) 


*f(h)  = o(h)  as  h 


0 + means  f(h)/h  ■*  0 as  h 


0 and  h > 0 


and  utilizing  the  initial  condition  gives 


P (t)  = exp  ( - / \(s)  ds 


Letting  m(t)  = / \(s)  ds,  we  see  that 
0 

PQ(t)  = exp  (-  (m(t) -m(t0»). 

For  n 2 1,  we  have  the  difference  equation 

Pn(t+h)  = Pn(t)  (l-hX(t))  + Pnl(t)  h\(t)  + o(h)  ash  — 0 + 

which  leads  to  the  differential  equation 

dPn(t)  = - \(t)  Pn(t).+  X (t)  Pn _1(t),  n = 1,  2,  .... 
dt 

This  system  of  differential  equations  may  be  solved  recursively,  having  found  PQ(t). 

P • S* » 

dpi(t)  = - X(t)  P1(t)  + \(t)  P0(t) 
dt 

= - X (t)  P^t)  + X (t)  exp  (-  (m(t)  -m(t0))) 

with  initial  condition  P1(tQ)  = 0.  This  differential  equation  leads  to 
P^t)  = (m(t)  - m(tQ))  exp  (-  (m(t)  - m(t0»). 

Continuing  in  this  fashion  gives 

Pn(t)  = PfN(t)-N(t0)  = n] 

_ Lm(t) -m(t0)]n  e-(m(t) -m(t0)) 


for  n — 0,  1,  2,  .... 

For  software  reliability  modeling,  N(t)  can  be  taken  to  be  the  total  number  of 
errors  detected  in  time  (0,  t).  The  expected  number  of  error  occurrences  in  (0,  t), 
E(N(t))  = m(t),  is  assumed  to  satisfy  the  following  conditions: 

m(t+h)  = m(t)  + bh(a-m(t))  + o(h)  as  h — 0+, 
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where  b > 0 is  a constant  of  proportionality,  and  a > 0 is  the  expected  number  of 
errors  to  be  found  in  infinite  time,  i.e., 

a = lim  m(t) 

t — CO 

(We  have  required,  of  course,  that  a be  finite.)  That  is,  the  ejpected  number  of 
errors  in  time  [0,  t+h]  is  approximately  the  expected  number  of  errors  in  time  (0,  t) 
plus  an  amount  proportional  to  h and  the  expected  number  of  errors  observable  in  time 
(t,  ) (analogous  to  the  number  of  errors  remaining,  N-Mj,  for  the  generalized  Poisson 

model).  Dividing  the  difference  equation  for  the  mean  value  function  by  h > 0 and 
letting  h — - 0 we  obtain 


(a-m(t)) 


with  initial  condition  m(0)  = 0.  The  solution  is,  of  course, 
m(t)  = a (l-c‘bt),  t 2 0. 

To  summarize  these  considerations,  we  have  for  any  t > > 0, 


PfN(t)-N(tn)  = n } 


f / -btft  — bt n / -bt„  — bt  \ 
i a ic  0-e  Jj  ^-ale  0 - e ) 


and  for  tQ  = 0, 


P { N(t)  = n}  = ' a(1-e^  -ll  e"a  (l  e ) 
n! 

For  0 s t^  < t2  < ...  < t^,  the  joint  probability  function 

P (N(tj)  = n^  ...,  N(y  = r^} 

is  difficult  to  derive.  However,  the  joint  probability  function  of  the  increments 
= Nfl^),  Z2  = N(t2)-N(t1),  ...,  Zk  = N^) -N(tk_1) 

is  straight-forward  since  the  increments  are  independent.  We  have 

r , ■ . i_.  v “|Z. 


■i  7i'  •••’  zk_  -TT 


k r < -bti-i  -bii 

nr  a\e  - e 


) 1 ( _bt-  i -bt.  \ 

/J  -a\e  l-l  - e \) 


where  tQ  s 0. 
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Since  for  t > t^  2 0,  N(t)  -N(y  has  the  Poisson  distribution  shown  above, 

E (N(t)-N(tQ))  = a(e_bt0  - e'bt) 

Var  (N(t)-N(t0))  = a(e“bt0  - e~bt). 

In  fitting  the  IBM  model  to  data,  the  data  needed  are  the  cumulative  times 
0 < ti  < t2  < ...  < tfc  and  cumulative  errors  N(ti)  < N(t2)  < ...  < N(tk)  observed. 
The  cumulative  errors  are  converted  to  increments  Z\  since  the  LSEs  and  MSEs  are 
derived  in  terms  of  the  increments  rather  than  the  cumulative  errors.  This  is  some- 
what different  than  what  was  done  in  (11,2)  . There,  the  LSE's  of  a and  b were 
derived  ty  minimizing 

V'  / / “bt 

S(a,  b)  = ^N(t.)  - a ( 1-c  1 

i=l 

However,  when  testing  goodness-of-fit,  it  will  be  shown  (see  Section  5.2)  that 
utilizing  the  increments  Zi  rather  than  the  cumulative  errors  in  fitting  the  models 
makes  complete  use  of  the  data  whereas  using  cumulative  errors  makes  use  of  only 
one  data  point  out  of  the  k observations. 

In  the  discussion  of  Section  5.2,  it  will  be  necessary  to  know  the  covariance 
matrix  for  the  normalized  variables 

Nftj)  - m(y  _ N(y-m(y 
Yf  - , . . . , Y.  = 3ZI  * 

Vm(t k> 

i.e.,  cov  (Yi,  ...,  Yk)  =N 


where 


N = E(Y.  Y),  i = 1,  ...,  k,  j = 1,  ...,  k. 
“ij 

For 


i = j.  ^ =5  = E(Yf)  = 1. 

ij  “ii 
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For  i > j,  we  have 


« 


= E(Y.  Y.)  = E 

.J 


l 


(N(t.)-m(t.))  (N(t.)-m(t.))  | 

Vm(ti} 

(N(t.)-m(t.))  (N(t.)-m(t.)) 


m(t.) 


\f  m(t.) 


(N(t.)-m(t.)) 


V~ "(v 


(N(t.)-m(t.))2 

m(V 


= E [(Y.  - Y.)  Y.  j + E(Y2)  = E(Y2)  = 1 

Since  Yj  - Yj  and  Yj  arc  independent.  Thus,  the  covariance  matrix  is  simply  a k \ k 
matrix  of  l's: 


As  mentioned  earlier,  the  constant  "a"  is  analogous  to  the  constant  N in  the  GPM. 
There  is  a very  important  difference,  though.  N is  the  number  of  errors  initiallv  in 
the  software  for  the  GPM  whereas  a is  the  expected  number  of  errors  which  could 

be  observed  in  unlimited  time,  i.e.,  if  we  let  N(°°)  denote  t‘^N(t)  (a  well  defined 

random  variable),  E (N(*x>))  = a since  N(co)  has  a Poisson  distribution  with  mean  a. 

Since  N(oo)  is  not  constant  almost  surely,  one  can  interpret  the  IBM  model  as  one  in 
which  errors  are  allowed  to  be  introduced  during  debugging. 

As  a final  observation  on  the  IBM  model,  it  should  be  pointed  out  that  when  the 
data  actually  behaves  as  though  they  were  generated  from  a homogeneous  Poisson 
process,  i.e.,  m(t)  = \t  for  some  positive  const:mt  \,  the  parameter  estimators 
will  reflect  this  by  the  estimate  of  b decreasing  towards  zero  while  the  estimate  of  a 
increases  without  bound,  with  the  product  ab  remaining  constant.  That  is,  letting 
ab  = \ or  a = X./b,  we  see  that 

fim  m(t)=  £im  a(l-e_bt)  = ^im  A (1_e~bt 
b — 0 b — 0 b — • 0 b 

= fim  -g  (1  - 1 - bt  + o(b)  •)  = Xt, 


the  mean  value  function  for  a homogeneous  Poisson  process.  This  happened 
frequently  as  shown  in  Tables  5. 1. 1 through  5. 1. 16. 
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3.3  BINOMIAL  MODEL 


The  binomial  models  discussed  here  are  based  on  a model  suggested  by 
P.  A.  P.  Moran  in  (11,5)  . Basically,  it  is  assumed  that  there  are  N errors  present 
initially.  Errors  arc  then  observed  in  successive  time  intervals  tj,  t.,  . ..,  t ^ 
under  the  assumption  that  at  the  end  of  each  time  interval,  the  errors  observed 
during  that  time  interval  are  removed  from  the  S/W,  thus  decreasing  the  total 
number  of  errors  in  the  S/W  by  that  number.  As  with  the  GPM  (and  the  S-W  and  J-M 
models)  no  consideration  is  given  to  the  possibility  of  creating  new  errors  in  the 
process  of  removing  errors.  The  joint  probability  function  is  derived  under  the 
assumption  that  Ni,  the  number  of  errors  observed  in  r j,  has  a binomial  distribution 
with  the  n-parameter  equal  to  the  total  errors  remaining  in  the  S/W,  and  the 
p-parameter  equal  to  some  function  of  t j and  possibly  an  unknown  constant.  This 
function  must  satisfy  the  following  assumptions  (denoting  the  function  by  p(t})): 

1.  0 < p(tj)  ? 1 for  all  Tj  > 0 

2.  p('i)  is  an  increasing  function  of  t j ^ 0 

3.  p(0)  = 0,  l im  p(t j)  = 1. 

7*  . ao 

Since  p(  ,)  is  interpreted  as  a probability  of  detection  in  time  t i,  assumption  1 is 
necessary  while  assumptions  2 and  3 mean  that  given  no  time,  no  errors  can  be 
detected  while  given  unlimited  time,  an  error  will  certainly  be  detected. 

Letting  pj  p(Tj)  we  may  now  derive  the  joint  probability  function  for  Nj,  ...»  Nfc. 
Clearly,  the  probability  function  for  Nj  is 

/ N \ ni  N"ni 

P f N j - n j 1 = I 1 Pj  (1-pj)  , nj  = 0,  ...»  N. 

' nl 

Given  Nj,  the  probability  function  for  N’2  is 

/N_nl\  n N-n  -n 

P n2  n2  1 Ni  " ni  - " y J P2  ~ (1-p2> 

for  = 0,  1,  ...»  N-nj  so  that  the  joint  probability  function  of  Nj  and  N2  is 
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ior  nj  = u,  x,  . . . , 


N,  n2  = o,  1,  . . . , 


N-rij.  Continuing  in  this  fashion,  we  get 


Two  different  functions  were  considered  for  p(T.).  The  first  was 
p(T.)  = T /(T  +c),  c > 0 


Tl'N'l 


1 


which  corresponds  to  Binomial  I in  subsequent  sections,,  This  function  clearly 
satisfies  the  assumptions  1,  2,  and  3 and  arises  as  the  probability  distribution  of  the 
ratio  of  two  suitably  chosen  independent  exponential  random  variables. 

The  second  choice  was 
-a  r . 

p(r.)  = 1-e  a >0. 

This  function  clearly  satisfies  assumptions  1,  2,  and  3 and  corresponds  to  Binomial  II 
in  subsequent  sections.  This  function  is  recognized  as  the  exponential  distribution 
function. 

The  main  rationale  for  choosing  these  functions  was  the  presence  of  only  one 
unknown  parameter  along  with  ease  of  computation  using  these  functions. 

The  binomial  models  are  not  appropriate  for  use  with  the  combined  "fix"  and 
"find"  data  since  the  errors  observed  in  each  time  interval  are  assumed  removed  at 
the  end  of  their  respective  time  intervals.  Also,  the  development  of  MLE's  for 
unknown  parameters  N and  a or  c is  very  complicated  and  costly  due  to  the  difficulties 
involved  with  differentiating  combinatorials  and  factorials.  It  is  possible  to  introduce 
approximations  (e.g.,  Stirling's  approximation  for  large  factorials)  as  in  (11,5)  but  the 
conditions  under  which  these  would  be  valid  could  not  possibly  apply  in  all  applica- 
tions. Hence,  only  LSEs  were  derived  in  this  study  for  the  binomial  models  by 


(N.  - E(N.|N1,  ...,  N._1))2 


to  the  unknown  parameters  N and  a or  c (see  Section  4. 1.4). 


minimizing 

h 

V 

i=l 

with  respect 
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Section  4. 0 


METHODS  OF  ESTIMATION 


4.1  TECHNIQUES  OF  ESTIMATING  MODEL  PARAMETERS 

4.1.1  Basic  Statistical  Techniques 

Two  different  statistical  methods  were  used  to  estimate  the  parameters  of  the 
models  of  Section  3.  These  methods  were  designed  to  yield  a set  of  parameter  esti- 
mates for  a data  set.  Among  these  sets,  the  most  often  estimated  parameter  was  N, 
the  unknown  total  number  of  errors  present.  In  the  IBM  growth  model  of  Section  3.2, 
the  estimate  of  the  parameter  a represents  the  total  number  of  errors  expected,  and 
includes  errors  added  through  performing  corrections.  The  other  parameters  esti- 
mated in  each  set  relate  to  the  shaping  and  scaling  of  the  time  to  error  distributions. 

The  basic  methods  employed  were  those  of  least  squares  estimators  (LSEs)  and 
maximum  likelihood  estimators  (MLEs). 

The  LSEs  of  a parameter  (or  set  of  parameters),  say  p,  are  those  values,  p, 
which  minimize  the  sum  of  the  squares  of  the  difference  between  the  observed  values 
of  a random  variable,  X,  and  expectations,  E(X).  I.  e. 

k 

S = ^ (X.  - E(X. ))“ 

i i 

i=l 


must  be  minimized  where  the  value  E(Xj)  is  dependent  upon  the  value  of  p. 

For  the  maximum  likelihood  method  of  obtaining  estimators,  the  probability  den- 
sity function,  f(X),  is  needed.  The  likelihood  function  L is  the  joint  probability  func- 
tion of  the  k observations  of  the  random  variable  X.  The  MLE's  are  those  values  p 
of  the  parameters  which  jointly  maximize  the  function  L for  a fixed  set  of  k observa- 
tions. The  values  for  p must  be  in  the  allowable  parameter  space;  e.g.  an  MLE  for 
the  variance  of  the  random  variable  must  lx?  non-negative. 

For  observations  of  the  random  variables  considered  here,  the  probability  density 
function  and  the  square  (X-E(X))2  (which  depends  upon  the  parameter  p)  arc  both 
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positive.  Thus  both  logarithm  L and  logarithm  S are  well  defined  functions.  Since  the 
logarithmic  function  is  monotone  increasing,  the  maximum  values  of  L and  ?nL  (or  S 
and  r’n  S)  occur  at  the  same  parameter  values  p (or  p).  So,  to  simplify  calculations, 

2n  L (or  t’n  S),  rather  than  the  likelihood  function  L itself  (or  sum  of  squares  S),  is 
maximized.  In  particular,  when  L is  a product, 


L=  7 T f(X.),  t’nL  = i?n  7T  f(X. ) 
i=l  1 1=1  1 


In  f(X. ) . 


This  follows  since  the  logarithm  of  a product  equals  the  sum  of  the  logarithms,  and  is 
quite  useful  to  actual  calculations. 


4.1.2  Numerical  Methods  of  Solution 


Due  to  the  multiplicity  of  the  parameters  in  the  models,  simultaneous  equations 
must  be  solved  in  order  to  find  the  MLEs  and  LSEs.  The  Newton-  Kaphson  method 
of  finding  roots  to  simultaneous  equations  (in  some  cases  the  relevant  first  partial 
derivatives  equated  to  zero)  was  used  frequently.  This  is  an  iterative  process  which 
uses  tangent  lines  as  approximations  to  the  curve  y F(X)  in  order  to  seek  out  the 
roots  of  the  equation  F(X)  0.  The  derivative,  F'(X),  of  the  function  is  used  in 
choosing  the  next  approximation  to  the  root  itself  through  the  recursive  relation 


F<Xi> 

X x + , — 

i+l  i F'(X.)  • 

An  error  bound  « > 0 is  needed  to  terminate  the  process  when  |Xj  + l — X| | < « . The 
root  is  then  approximated  to  lie  Xj+i,  the  value  to  which  the  successive  approximations 
have  converged.  Values  of  c on  the  order  of  10“^  to  10-,)  were  used  in  this  study. 

4.1.3  Estimation  for  the  Poisson  Models 

Computerized  methods  of  obtaining  estimators  for  the  parameters  of  the  "Poisson" 
type  models  described  in  Section  3.1  were  used.  In  this  case  it  was  assumed  that 
each  Nj  (number  of  errors  observed  in  the  ith  time  interval)  had  a Poisson  distribution 
with  mean 


\ = <J>(N-Mj_,)  r . , 

with  the  two  particular  cases  of  o = 1 (J-M)  and  o = 2 (S-VV). 
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Here  the  LSEs  are  values  N and  o (and  a in  the  three  parameter  case)  which 
minimize  the  sum 

k 

■-2 

i=l 


N.  - 0( N-M.  ,)t.® 
i l-l  i 


The  LSEs  are  the  solutions  to  the  equations  obtained  by  setting  the  first  partial 
derivatives  of  S (with  respect  to  N,  0,  and  a)  equal  to  zero.  In  the  two  parameter 
case  this  results  in  the  least  squares  equations: 


jdS  dS  . 
d N ' dO 


0 


where 


as 

dN 


k k 

-20%  N.  t°  + 2<J>2  % (N-M.  ,)  t.  lo 

Z-  l l 1-1  l 

i=l  i=l 


dS 

dO 


k 

N.  (N-M.  .It.0,  + 20%  (N-M.  ,)2  t 2a 

i i-l  i ^ i-l  i 

i=I 


and  a = 1 or  2 . 


In  the  three  parameter  model,  the  least  squares  equations  are: 

= = dS  = o 

dN  dO  da 

where 


dS 

dN 


k k 

-2®  I Ni  T l"  + 2®2  S 

i=l  i=l 


(N-Mj.jjr 


2 a 


dS 
d 0 


k k 

-2  V N.(N-M.  .It.0,  +20%' 

J -»  1 1 “ 1 1 / j 

i=l  i=l 


(N-M^])2 
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» . » . «. * * 


^ 


«.  _ ... . 


WJ-' 


da 


° * ' 202  S 2 20 


o r-* 

— = -20  ^ Ni(N-M._i)T“  ?nT.  + 20“  2 (N-M,_1)“t.““  fnr.. 


i=l 


i=l 


The  iterative  procedure  for  solving  these  equations  requires  starting  points  for  N 
(and  a in  the  three  parameter  case),  with  the  starting  point  for  N being 


5 2 Ni 


i=i 

(the  total  number  of  errors  already  observed). 

In  the  two  parameter  cases  (i.  e.  a known)  o was  eliminated  from  the  least  squares 
equations  leaving  one  equation  to  solve  for  S using  the  Newton-Raphson  method,  © was 
then  computed  using  f?  and  one  of  the  original  least  squares  equations.  For  the  three 
parameter  case  the  parameter  o was  eliminated  from  the  least  squares  equations  and  a 
two-dimensional  Newton-Raphson  method  was  used  to  find  N and  a (having  N and  a,  © 
was  easily  computed  using  one  of  the  original  least  squares  equations).  It  was  assumed 
that  the  estimate  of  N need  not  be  an  integer  (although  Kf  must  satisfy) 


N > 


k 

V 

i=l 


N.). 


This  adjustment  in  assumptions  was  necessary  to  facilitate  computations.  Nearest 
integer  values  of  N are  adopted  in  tabulating  obtained  results.  This  convention  was 
used  throughout  the  models. 

The  MLEs  are  those  estimates  N and  6 (and  a in  the  three  parameter  case)  winch 
maximize  the  logarithm  of  the  likelihood  function, 


k ( 

/n  L = Y !n  f i" ° 


I L 

i=l  1 


0(N-Mj_j ) t 


N.  - 0(N-M.  , ) t 

i 1-1 


a 


n,:(  . 


In  particular,  in  the  two  parameter  case,  the  MLE's  are  the  solutions  to  the 
likelihood  equations: 


d l n L din  L 


dN 


d O 


= 0 
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whe  re 


din  L 

Ni  $ y 

dN 

Z N 

-«,-i  4 2,  t 

i=l 

i=l 

k 

k 

d In  L 

= — V 

N.  -Y  (N-M. 

i l 

d 0 

0 <— 

1 or  2 . 

i=l 

i=l 

a 


a 


In  the  three  parameter  ease,  the  MLE's  were  the  solutions  to  the  likelihood  equations: 


where 


di?n  L 

d i n L 

di?n 

L 

0 

dN 

do 

do 

k 

k 

din  L 

- Y _ 

Ni 

Y T « 

dN 

— N 

"Mi-1 

— Q 

L-,  i 

i=l 

i=l 

k 

k 

din  L 

dO 

= _L  Y 

o 

N.  - 

i 

£ 

i = l 

i~l 

k 

k 

di  n L 

dOr 

l S') 

II 

. t?n  t. 

i i 

- 6 

M 

i 

T 

i 1 

i-i 
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The  iterative  procedure  for  solving  these  equations  requires  starting  points  for  N 
(and  a in  the  three  parameter  case),  with  the  starting  point  for  N being 

k 

2 Y N.. 

i=l 

As  with  the  LS  method,  in  the  two  parameter  case  0 was  eliminated  from  the  likeli- 
hood equations  leaving  one  equation  to  solve  for  N using  the  Newton-Raphson  method.  $ 
was  then  computed  using  $ and  one  of  the  original  likelihood  equations.  In  the  three 
parameter  case  O was  again  eliminated  from  the  likelihood  equations,  leaving  only  two 
equations  to  solve  for  N and  cit  using  a two-dimensional  Newton-Raphson  method  ($  was 
computed  using  N,  a and  one  of  the  original  likelihood  equations). 
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,1-m,  |A»|  mAiA 


4.1.4  Estimation  for  the  Binomial  Models 


For  the  binomial  models  described  in  Section  3.3,  where  the  number  of  observa- 
tions, Nj,  is  assumed  to  have  a binomial  distribution  with  parameters  p,  and 
two  separate  cases  were  considered: 


i)  p.  = L 

Hi  t.  + 


...  , -ar, 

u)  pj  = 1-e  i 

with  both  c and  a assumed  positive. 

Here  the  least  squares  method  finds  those  estimates  N and  c (or  a)  which 
minimize  the  sum 


s 

i=l  L 


N,  - (N-M^jjp, 


where 

i-1 

M.  , = N N .,  i=2 k,  M = 0. 

i-l  Z-  ) 0 

j=l 

Specifically,  for  the  binomial  models,  the  LSEs  are  the  solutions  to  the 
equations: 


as  us 


dN  a c 
For  the  model  having; 


= 0 or 


a s a s 

dN  “ 9a 


0. 


P: 


i t.  + c ’ 

i 


the  partial  derivatives  are  given  by: 

k k 

+ 2 

l T T C 

1=1  ‘ i=l 


as 

aN 


■ -i 2 1 Ni  Trh  * 2 I 


T . 


T.  + c 
i i > 
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k 

k 

as 

2 V 

(T.  + C) 

-2  ^ (N- 

dc 

i=l 

i-1 

model  having 

, -ar. 

l5i  = 

1-e  i 

» 

k 

k 

dS 

-2  V 

N.  <l-e“ari)  + 2 

1 

^ (N-M.  , 

dN 

. 

i-l 

i 1 

i=l 

k 

k 

dS 

-2  \ 

N.t.(N-M.  ,)e_aT 

1 i l-l 

1 + 2 ^ T 

<3  ii 

i 1 

i-l 

T -Z 


l_1  (T.+C)'5 


-ai.. 


-ai  -aT. 


The  iterative  procedure  tor  solving  these  equations  requires  starting  points  for 
both  N and  c (or  a),  with  the  starting  point  for  N being 

k 

> v N 


and  the  starting  point  for  c or  a being  0.  A detailed  discussion  of  these  and  other 
starting  points  is  given  in  Section  4.2.2. 

The  maximum  likelihood  method  finds  those  values  N and  c (or  a)  which  maximize 
the  logarithm  of  the  likelihood  function, 


In  L = 


N-M; 


i-1 


N. 


P: 


N 


(1 


N-M. . 

■to  i 


Due  to  the  computational  difficulties  and  computer-time  requirements  MLEs  were  not 
calculated  for  the  binomial  models. 


4.1.5  Estimation  for  the  IBM  Growth  Model 

The  computerized  methods  of  obtaining  estimators  for  the  model  of  the  non-homo- 
geneous  Poisson  Process  described  in  Section  3.2  follow.  Here  the  number  of  errors 
occurring  through  time  tj,  N(tj),  is  considered  to  have  a Poisson  distribution  with 
mean  m(t.)  = a(l-e“^i)  with  a and  b positive.  Also  N(t^)  = 0 and  t^  = 0. 
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. 


Here  the  LSEs  are  those  values  a and  E which  minimize  the  sum 

k f 1 2 


s = ^ N(t.)  - N(tl_1)  - ^ m^)  - m(ti_1) 


In  particular,  the  LSEs  are  the  solutions  to  the  equations: 

as  as 


3a  3b 


where 


as  _ 

-2  y 

aa 

i=l 

k 

as 

-2  ^ 

db 

i=l 

I -bt.  -bt.'T 
( N(t,)  - Nltj.j)  ) - a 'e  1_  -e  I 


-“i-i  -bti) 
e -e 


-bt.  . -bt.' 

l-l  l , 

-e  ) 


^ la  (•> 


-bt.  -bt.  ,\ 

e - tj.j  e ) 


The  parameter  a was  eliminated  from  the  two  equations  above  leaving  one  equation 
to  solve  for  b using  the  Newton-Raphson  method.  The  estimate  a is  then  calculated 
from  the  equation: 


The  MLEs  are  those  values  a and  8 which  maximize  the  logarithm  of  the  likeli- 
hood function. 


k exp  ( - (m(t .)  - m(t.  .)))fm(t  ) - m(t  ))  i 

In  L = In  7 T 5 — 

i=l  \ • 


where 


Zj  = N(t.)  - Nft^p  , i=l k,  tQ  = 0 . 
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Here  the  MLEs  are  the  solutions  to  the  equations: 

din  L _ din  L _ 

3a  3b 
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The  parameter  a was  eliminated  from  the  two  equations  above  leaving  one  equation  to 
solve  for  b using  the  Newton- Raphson  method.  The  estimate  a was  then  computed 
from  the  equation: 


4.2  PROPERTIES  OF  ESTIMATES 
4.2.1  Ease  of  Calculation 

Throughout  the  applications  of  the  models,  there  were  varying  degrees  of  ease  of 
calculation.  Often,  calculations  were  simplified  by  approximating  derivatives  in 
order  to  apply  the  Newton-Raphson  method.  This  made  computer  computations  easier 
(as  well  as  less  time  consuming  and  expensive),  but  also  gave  slightly  less  accurate 
estimates.  Derivative  approximations  were  used  to  find  estimates  in  the  three  param- 
eter Poisson  model,  both  binomial  models  and  the  IBM  growth  model. 

The  greatest  difficulty  in  implementing  a method  to  find  parameter  estimates 
occurred  in  applying  the  ML  method  to  the  binomial  models  of  Section  3.3.  Here 
finding  estimates  for  the  parameter  c (or  a)  which  maximized  the  likelihood  function 
for  corresponding  integer  N's  (over  all  values  of 

k 

N > ^ N.) 

i=l 

proved  much  too  computer  time  consuming  and  costly  for  practical  use.  This  also 
suggests  that  testing  only  integer  estimates  of  N would  prove  to  be  impractical  for 
the  other  models  also. 
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Another  computational  problem  arises  in  applying  a second  derivative  test  to 
insure  attaining  the  proper  extremum  for  the  three  parameter  Poisson  model. 

These  computations  become  extremely  complicated,  and  may  produce  inaccurate 
results  when  applied  to  the  approximations  of  the  estimators  N,  a,  and  $ (or  ft,  5 
and  $).  This  problem  may  also  arise  in  applying  a second  derivative  test  for  extrema 
to  two  parameter  cases. 

Specifically,  a discrepancy  arose  in  applying  a second  derivative  test  to  a fit  of 
the  J el insk i- Moranda  model  (ML  method)  to  data  set  number  6-fd.  Here  the  esti- 
mates N 113, j94  and  0 8.95  x 10  were  evaluated  to  lx1  maximizing  estimates 

of  the  parameters;  while  the  values  N = 113,  594  and  $ = 9.0  x 10~<>  were 
considered  to  act  as  a saddle  point.  This  difference  in  round-off  error  on  the  order 
of  10  may  illustrate  the  existence  of  severe  precision  problems  in  computing 
parameter  estimates. 


4.2.2  Convergence  Difficulties 

Throughout  the  analysis  difficulties  in  achieving  convergence  of  the  estimates 
were  encountered.  One  of  the  most  acute  problems  is  choosing  satisfactory  starting 
points  as  inputs  (required  to  initiate  the  Newton- Haphson  method  of  solution).  A 
basic  problem  with  starting  points  is  that  none  of  the  literature  discussing  the  various 
software  reliability  models  contain  any  conventions  for  obtaining  starting  points, 
although  iterative  processes  are  usually  cited  for  use  in  solving  the  relevant  equations. 

. 

The  most  severe  cases  of  starting  point  problems  occurs  in  the  models  requiring 
two  simultaneously  input  starting  points.  Some  cases  with  this  problem  are  shown 
in  Table  4.2.2 .1  . 

This  table  shows  that  small  differences  in  the  input  starting  points  may  result 
in  vast  differences  of  attaining  convergence.  This  is  especially  true  for  the  values 
in  the  last  two  rows  of  Table  4.  2. 2. 1. 

Observation  shows  that  the  choice  of  the  starting  point  is  more  critical  in  apply- 
ing the  ML  method  to  a particular  model  than  in  applying  the  LS  method.  This  may 
. lx?  due  to  the  need  to  solve  less  complicated  equations  in  the  latter  case. 

The  possibility  of  the  existence  of  more  than  one  set  of  solutions  to  the  likelihood 
(or  sum  of  squares)  equations  is  seen  most  clearlv  in  applications  of  the  IBM  model. 
Here  it  was  found  that  both  the  LS  and  ML  method  could  lead  to  convergence  to  two 
(or  more)  different  sets  of  estimates  - one  set  having  a finite  positive  value  for  a-  (or 
a),  and  a positive  value  for  b (b);  the  other  having  the  iterative  approximations  for  a 
(a)  increasing  without  Ixxind,  the  values  for  6 (I?)  tending  to  zero,  and  the  product 
ab  (ab)  approaching  a constant  value.  Upon  encountering  the  latter  case,  changing 
starting  points  could  sometimes  result  in  the  former  case  of  finite,  non-zero  esti- 
mates of  the  parameters.  This  use  of  change  in  starting  points  is  illustrated  in 
Table  4 .2 . 2 .2 . 

The  choice  of  starting  points  appeared  to  be  the  least  critical  in  applying  the  LS 
method  to  the  Jelinski-Moranda  and  Shick-Wolverton  models  of  Section  3.1 . Because 
of  this  situation,  in  trying  to  fit  a particular  data  set  to  the  models,  it  may  be  helpful 
to  try  finding  parameter  estimates  for  these  models  before  applying  the  other  models 
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Table  4.2.2. 1.  Starting  Point  Analysis  for  the  Binomial  Model 


BINOMIAL  1 


Starting 

Points 

Result 

N 

c 

500 

15 

N on  - Con  ve  rgence 

550 

25 

N on  - Con  ve  rgence 

450 

200 

N on - Con  ve rgen  c e 

500 

15 

Convergence  to  invalid  estimates 

4 Of) 

75 

CONVERGENCE  TO  VALID  ESTIMATES 



BINOMIAL  11 

Starting  Points 

Result 

_ . - 

N 

a 

3600 

0.1 

Program  interrupt  due  to  error 

3500 

0.01 

Convergence  to  invalid  estimates 

3600 

, 0.009 

Convergence  to  invalid  estimates 

3599 

! 0.008 

1 

CONVERGENCE  TO  VALID  ESTIMATES 

(or  MLE  method).  In  doing  so,  more  efficient  starting  points  for  the  estimates  of 
the  parameter  N for  the  other  models  may  be  suggested. 

4.2.3  Types  of  Convergence  Observed 

Different  types  of  convergence  were  encountered  for  the  various  models.  Although 
monotone  convergence  (e.g.  usually  increasing  for  estimates  of  N)  was  most  common, 
the  path  of  the  iterations  was  not  always  predictable.  Oscillation  was  a common 
occurrence  in  applying  the  ML  method^of  seeking  estimates  for  the  IBM  model  of 
Section  3.2.  Here  when  estimates  of  b tended  toward  zero,  oscillation  with  changes 
in  sign  alxnit  zero  often  resulted.  Notice  that  this  shows  there  is  no  practical  method 
of  choosing  starting  points  which  will  always  result  in  positive  estimates  for  a and  b. 

Table  4. 2. 2. 2.  Starting  Point  Results  for  the  LSEs 
for  the  1 BM  Model  - Data  Set  14-fd 


Starting  Point  (b) 

a 

! 5 

0.9 

-2.11 

i 

-0.6025 

| 

0.01 

, 1.52  x10s 

2.15  x 10~7 

0 . 55 

J 

353.70 

0.3510 
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Results  for  Data  Set  14-fd  show  different  starting  points  for  b to  yield  widely 
varied  parameter  estimates.  Using  the  starting  estimate  of  0.  55  for  b,  however, 
appears  to  result  in  the  most  appropriate  parameter  estimates. 

The  only  model  found  to  result  in  convergence  at  all  times  is  the  IBM  model  of 
Section  3.2.  Here  both  the  LS  and  ML  methods  always  gave  parameter  estimates. 

The  proljable  explanation  for  this  "guaranteed"  convergence  is  that  the  iterations  of 
only  one  parameter  estimate  (those  of  I)  or  b)  are  compared  in  order  to  determ ine 
convergence.  This  results  in  the  case  where  the  estimates  of  a (a)  diverge  (increase 
in  absolute  value  without  lx>und)  being  considered  "convergent".  However,  other 
models  treat  similar  cases  (e.g.  estimates  of  ft  increasing  without  bound,  while  esti- 
mates of  c converge  to  a finite  value)  as  not  being  convergent. 

Convergence  criteria  should  be  standardized  for  all  of  the  estimates  and  the 
models.  Without  a uniform  treatment  of  concepts  such  as  convergence  it  can  be  mis- 
leading to  compare  properties  (such  as  frequency  ot  convergence)  of  the  various 
models . 

Even  with  convergence  achieved,  the  resulting  set  of  parameter  estimates  may 
not  actually  be  the  desired  extremum.  A listing  of  the  actual  numbers  of  proper 
extrema  found  for  each  model  is  shown  in  Table  4.  3. 1. 

Second  derivative  testing  did  show  the  existence  of  some  saddle  points  as  esti- 
mates for  several  models.  Subsequent  computer  testing  could  not  always  result  in 
convergence  to  other  parameter  estimates  for  these  data  sets,  leaving  the  question 
of  whether  an  actual  extremum  can  be  found  still  open. 

However,  the  validity  of  the  second  derivative  test  on  the  resulting  approximations 
of  the  parameter  estimates  should  not  l>e  considered  absolute,  as  illustrated  by  the 
previously  stated  example  (given  in  Section  1.2.1)  where  small  round-off  error  for 
the  estimates  led  to  two  different  conclusions  for  the  test. 


4.2.4  Convergence  to  Invalid  Estimates 

There  is  also  a problem  of  tendencies  toward  convergence  to  parameter  estimates 
(even  proper  extrema)  which  violate  the  model  assumptions. 

For  most  models  (all  but  the  IBM  model  of  Section  3.2),  convergence  to  estimates 
having  ft  (ft)  less  than  the  total  number  of  errors  observed, 


occurred  frequently.  These  estimates,  however,  were  not  considered  to  be  I ts  of  the 
model. 


In  the  three-parameter  GPM  of  Section  3. 1,  convergence  with  the  resulting 
estimate  ri  <•  0 (or  a < 0)  was  common.  This  would  mean  smaller  expected  error 
rates  over  longer  time  intervals.  Nonetheless,  these  results  were  considered  to  be 
plausible  fits.  It  should  be  pointed  out  that  changes  of  some  data  bases  from  time 
units  of  days  to  units  of  hours  changed  only  the  estimates  of  o for  fits  to  the  Poisson 
models  of  Section  3.  1.  The  remaining  estimates  of  N (and  a?)  were  unaltered.  That 
is,  they  were  invariant  under  changes  of  scale. 

Finally,  the  oscillation  sometimes  encountered  in  applying  the  ML  method  to  the 
IBM  model  of  section  3.2  sometimes  resulted  in  negative  estimates  for  !x>th  a and  b. 
This  contradicts  the  assumption  that 

a = lim  E(N(t)) 

t — 03 

the  expected  total  number  ol  errors  to  be  found  in  the  software.  This  problem,  how- 
ever, could  usually  lie  eradicated  by  trying  other  starting  points. 


4.3  EXISTENCE,  I Nigi'ENESS  AND  CONVERGENCE  PROPERTIES  OF  THE 

ESTIMATORS 

The  statistical  and  numerical  methods  for  obtaining  parameter  estimators 
described  in  Section  1.1  may  not  always  yield  usable  results.  First,  there  may  be 
conditions  under  which  the  statistical  theory  leads  to  a situation  in  which  the  estimator 
does  not  exist.  Estimators  which  are  not  unique  arc  also  a problem.  That  is,  the 
extreme  value  of  the  function  may  result  from  different  sets  of  parameter  estimates. 
For  example,  in  the  case  of  the  MLE's,  there  may  be  more  than  one  value  of  the 
estimator  which  yields  an  identical  value  for  the  maximum.  In  this  situation  there 
is  no  way  to  know  which  estimate  to  choose,  or  even  if  all  maximizing  estimates  have 
beer,  found.  Finally,  the  numerical  methods  employed  may  not  always  lead  to 
convergence. 


1.3.1  Existence  ol  the  Estimators 

II  an  estimator  fails  to  exist  at  all  times,  there  will  be  instances  where  another 
estimator  (which  does  exist!  must  lie  used  instead.  The  limitations  of  an  estimator 
which  may  not  exist  make  it  impossible  to  rely  on  these  estimators  for  all  samples. 


In  the  J-M  model  described  in  Section  3.  1,  for  example,  the  general  criteria  for 
existence  of  the  maximum  likelihood  estimator  have  not  been  established.  However, 
a specific  case  where  the  MLE  of  N in  the  J-M  model  does  not  exist  has  been  found. 

Consider  the  case  of  only  two  positive  time  intervals  Tj  and  r g,  with  exactly 
one  error  in  each  interval.  Let  e be  the  positive  constant 
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With  the  parameter  O eliminated,  the  likelihood  function  of  N for  this  case  becomes 


L(N) 


4c  N(N-1) 

[N  + c(N-l)]2e2 


where  e is  the  usual  base  of  natural  logarithms. 


The  ratio,  c,  of  the  sizes  of  the  two  time  intervals  is  crucial  here.  Notice  that 
any  estimator  of  N must  be  > 2. 

i)  If  0<c  <1,  L(N)  (as  shown  in  Figure  4.3.1a)  is  a monotone  increasing 
function,  and  therefore  the  MLE  does  not  exist.  In  this  example,  then, 
if  the  error-free  time  to  the  second  observation  is  no  larger  than  that 
to  the  first,  the  maximum  likelihood  method  of  finding  an  estimator  for 
N will  not  yield  a solution. 

ii)  Further  calculations  show  that  if  c >\  3,  the  integer  %alue  (since  N is 
an  integer,  so  must  its  MLE,  N,  be)  which  maximizes  the  likelihood 
function  is  N =2,  the  total  number  of  errors  already  olserved  (see 
Figure  4.3.1b).  This  is  not  an  invalid  estimate,  but  may  be  somewhat 
impractical.  Here,  with  only  two  errors  observed  it  is,  in  most  cases, 
unlikely  that  they  represent  all  of  the  errors. 

iii)  Finally,  if  1 < c < \3  the  MLE  will  lx?  an  integer  greater  than  2,  depend- 
ing on  the  exact  value  of  c.  That  is,  the  MLE  exists  and  indicates  that 
undetected  errors  still  remain  (see  Figure  4.3.1c). 

The  establishment  of  existence  criteria  for  the  MLE  in  this  example  was  due 
largely  to  the  simplicity  of  the  data  base  (i.e.  two  intervals).  In  more  complicated 
cases,  existence  of  the  estimators  has  yet  to  be  established.  However,  the  above 
result  must  not  be  underrated.  Although  obtained  under  conditions  of  a very  small 
data  base,  it  establishes  that  there  is  no  guarantee  of  the  existence  of  the  MLE  in 
any  situation. 


Valid  estimators  also  will  not  exist  if  their  resulting  values  are  contrary  to  the 
model  assumptions  (these  assumptions  are  discussed  in  Section  3).  Many  such  contra- 
dictions were  encountered  in  actually  testing  the  data  bases  of  Section  2. 

One  contradiction  occurs  when  the  estimator  of  the  total  number  of  errors  in  the 
software,  N (or  51),  is  smaller  than  the  actual  numlx'r  of  errors  already  observed, 

k 

I «,■ 

i=l 

Both  the  ML  and  LS  methods  were  subject  to  this  problem.  More  severely,  even 
negative  estimates  for  model  parameters  assumed  to  be  positive  were  obtained.  This 
is  not  an  uncommon  situation  in  statistics.  The  moment  estimators  often  result  in 
negative  estimates  for  the  variance.  This  is  one  reason  why  moment  estimators  are 
rarely  used. 
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In  the  binomial  model  with  parameter  p,  = l-e-aTi  (described  in  Section  3.3), 
both  the  ML  and  LS  methods  of  estimation  could  result  in  the  estimator  a (or  3)  being 
; egative.  Since  the  time  interval,  t is  always  positive,  this  gives  a negative  value 
for  the  binomial  parameters,  pj,  which  is  contradictory  to  the  assumptions  of  the 
binomial  model  and  is  practically  absurd. 

Frequently,  in  the  three  parameter  Poisson  type  model  deseril>ed  in  Section  3.1, 
the  resulting  estimator  obtained  for  a (through  lx>th  ML  and  LS  methods)  was  negative. 
Although  this  may  present  problems,  it  was  not  considered  an  invalid  result. 


4.3.2  Determination  of  Extrema 

Most  of  the  methods  described  in  Section  4.1  for  obtaining  estimators  used  only  a 
first  derivative  test,  which  does  not  necessarily  yield  the  proper  extremum.  That  is, 
a maximum  value  rather  than  a minimum  value  may  lx.*  found  in  using  this  method  of 
finding  the  LSEs  (similarly,  a minimum  found  as  the  MLEs).  Saddle  points  (critical 
values  satisfying  the  first  derivative  test,  but  neither  minima  nor  maxima)  could 
also  occur. 

A computer  program  which  further  tested  all  resulting  estimators  for  the  proper 
extremum  using  second  derivative  tests  was  designed.  Tlx*  results  are  shown  in 
Table  4.3. 1.  The  test  did  show  some  of  the  estimators  obtained  to  be  saddle  points. 


TABLE  4.3. 1.  VERIFICATION  OF  PROPER  EXTREMA  IN  LSEs 
AND  MLEs  FOR  CASES  OF  CONVERGENCE 


Model 

No . Ptopel*  Ext  rema 
(Minimum  for  LSE,  Maximum  for  MLE) 

No.  Saddle” Points 

IBM  (LSE) 

50 

4 

IBM  (MLE) 

4(> 

7 

J-M  (LSE) 

38 

0 

J-M  (MLE) 

39 

3 

Binomial  1 (LSE) 

37 

0 

Binomial  II  (LSE) 

32 

0 

S-W  (LSE) 

20 

2 

S-W  (MLE) 

23 

0 

GPM  (LSE) 

36 

o 

GPM  (MLE) 

38 

2 

4-16 

It  is  possible  that,  when  a saddle  point  or  improper  extremum  results,  a change 
in  starting  points  could  yield  the  proper  extremum.  This  situation  is  discussed 
further  in  Section  4.2.2. 


•1.3.3  Convergence  of  the  Newton -Raphson  Method 


In  the  actual  use  of  the  Newton- Raphson  method,  convergence  of  the  estimators 
to  finite  values  could  not  always  be  obtained.  The  major  problem  seemed  to  be  in 
finding  successful  starting  points  for  the  parameter  estimates  as  inputs  to  the  pro- 
grams. In  general,  no  real  guidelines  were  found. 

Some  starting  points  led  to  divergence  of  the  estimates  of  N (or  N),  i.e.  the  suc- 
cessive approximations  of  N (or  N)  would  increase  steadily  without  bound.  This  could 
sometimes  be  corrected  bv  decreasing  the  initial  value  of  N. 

Some  starting  points  would  give  convergence  to  invalid  values  for  the  estimators, 
while  changing  the  starting  points  sometimes  gave  valid  estimates  of  the  parameters. 
(See  Tables  4.  2.  2.  1,  4.  2.  2.  2).  Often  in  the  Poisson  models  of  Section  3. 1 a data  set 
resulting  in  a negative  value  for  N (or  fi)  could  yield  a valid  estimate  for  N if  its 
starting  point  was  sufficiently  increased.  The  possibility  exists  that  different  starting 
points  can  lead  to  convergence  to  more  than  one  valid  set  of  estimators  for  a single 
set  of  data.  In  such  cases  there  was  no  measure  found  guaranteeing  which  values  were 
more  appropriate. 

Particular  difficulty  in  finding  useful  starting  points  for  the  parameter  estimates 
occurred  in  the  three  parameter  Poisson  model  of  Section  3.  1 and  the  least  squares 
version  of  the  binomial  models  of  Section  3.  3.  Here  both  starting  points  (initial  values 
toy  the  estimates  of  N,  a;  N,  c;  or  N,  a)  must  be  coordinated  in  order  to  succeed  in 
obtaining  convergence  for  the  estimates.  In  the  binomial  case,  even  if  convergence 
/has  been  obtained,  reusing  those  estimators  as  input  starting  points  for  the  same 
model  and  method  on  the  same  data  set  will  not  always  result  in  convergence  again. 

This  definitely  presents  problems  in  choosing  initial  values  to  input. 

for  the  J-M  and  S-W  models  of  Section  3. 1,  convergence  seems  somewhat  easier 
to  obtain  when  using  the  least  squares  method  than  with  the  maximum  likelihood 
method.  Convergence  in  those  computer  programs  using  the  LS  method  was  usually 
faster,  i.e.  fewer  iterations  were  needed  in  the  convergent  sequence  of  parameter 
estimates. 

Convergence  also  is  frequently  more  direct  in  the  programs  using  the  least 
squares  method,  e.g.  a monotone  convergent  sequence  of  parameter  estimates  rather 
than  an  oscillating  sequence. 


4.3.4  Uniqueness  of  the  Estimators 

The  uniqueness  of  either  the  maximum  likelihood  or  least  squares  parameter 
estimators  cannot  be  insured  as  the  following  example  illustrates. 
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Consider  the  MLE  of  N for  the  particular  case  of  the  J-M  model  of  Section  3.  1. 

Here,  assume  two  positive  time  intervals  t and  r with  the  positive  constant 

1 ** 


C = t2/tx 

as  before.  Assume  no  errors  observed  in  the  first  time  interval,  r i,  and  one  error 
detected  in  the  second  interval,  73.  With  the  parameter  0 eliminated,  the  likelihood 
function  evaluated  for  this  case  becomes 


L(N) 


c 

(c  + 1)  e 


which  is  a constant  independent  of  N.  Therefore,  since  any  value  of  N maximizes  the 
likelihood  function,  the  MLE  of  N in  this  case  is  not  unique  whatever  be  the  ratio  c. 


The  possibility  of  a multimodal  likelihood  function  (or  ?nore  than  one  local  mini- 
mum of  the  sum  of  squares  function  S)  cannot  be  eliminated,  either.  The  present 
numerical  techniques  of  finding  the  parameter  estimators  cannot  accommodate  this 
situation  in  finding  an  overall  maximum  (or  minimum)  value.  Specifically,  in  the 
present  ML  method  of  finding  parameter  estimators  for  the  binomial  models  (maxi- 
mizing the  likelihood  function  lor  each  integer  value  of  N)  multimodal  situations  were 
observed.  Therefore,  the  required  size  of  a sufficiently  large  testing  range  for  N 
could  not  be  established. 


No  method  of  guaranteeing  the  uniqueness  of  either  the  MLEs  or  LSEs  has  been 
found.  The  numerical  techniques  presently  used  in  finding  these  estimators  can 
detect  only  one  set  of  parameter  estimates  which  satisfy  the  relevant  equations,  and 
do  not  eliminate  the  existence  of  other  estimates  which  may  yield  the  same  (or  pos- 
sibly lx?tter)  value  of  the  likelihood  function,  L,  or  sum  of  squares,  S.  Without  a 
guarantee  of  uniqueness,  it  may  be  advisable  to  continue  testing  a model  for  a single 
data  set,  even  after  a suitable  set  of  parameter  estimates  has  been  obtained. 

If  criteria  for  the  existence  of  the  estimators  can  lie  established,  some  of  the 
difficulties  in  obtaining  covergence  for  the  estimators  might  be  eliminated.  One 
idea  is  that  the  observed  rate  of  increase  (or  decrease)  of  errors  could  possibly 
be  used  to  establish  bounds  for  values  of  the  estimators.  This  could  aid  in  the  se- 
lection of  the  proper  estimate  in  a multinodal  (more  than  one  maximum)  situation 
or  conserve  efforts  in  cases  where  nonexistence  of  the  estimators  has  been  deter- 
mined. This  idea  has  not  been  studied  here.  Additionally,  such  criteria  could 
possibly  aid  in  finding  suitable  starting  points  for  the  estimates.  In  particular, 
determining  bounds  on  the  estimators  could  lead  to  a coordination  of  the  input 
values  necessary  in  the  models  requiring  two  starting  points. 


It  is  significant  to  notice  that,  at  present,  when  parameter  estimators  cannot  be 
obtained  for  a particular  data  set,  there  is  no  way  to  distinguish  between  the  nonex- 
istence of  these  estimators  and  failures  in  the  numerical  method  of  solution  (e.g. , 
improper  starting  points). 
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4.4  SMALL  AND  LARGE  SAMPLE  PROPERTIES  OF  THE  ESTIMATORS 


4.4.1  Lai  ‘ Sample  Results:  MLEs 

4. 4. 1.1  The  GPM,  S-W  Model,  and  J-M  Model 

True  asymptotic  results  for  the  MLEs  and  LSEs  for  the  GPM,  J-M  and  S-W 
models  are  not  valid  unless  the  assumption  is  made  that  not  all  of  the  N initial  errors 
can  ever  tie  removed.  That  h>,  we  must  assume  that  there  are  numbers  JVI  and  M such 
that  for  all  i,  0<  N1  ? Mj-i  ?M<  N . if  this  is  not  the  case,  the  successive  observed 
errors  Nj,  N2^  . ..  will  eventually  degenerate  to  the  value  zero.  Indeed,  no  matter 
how  much  debugging  time  is  spent,  errors  still  occur  although  they  are  usually  rare 
in  a delivered  software  package.  The  existence  of  the  number  M then  appears  reason- 
able and  renders  proper  asymptotic  analysis  of  the  software  models  mentioned  above 
possible . 


As  a further  assumption  it  is  required  that  there  are  timesJL  and  t such  that 
0<t^<Tj<t<co  for  all  i. 

As  mentioned  in  Section  3. 1 the  likelihood  function  for  the  sample  N . . . , is 


N. 
a . i 


£(N. , . . .,  N,  ; 0,  N,  a)  = 7 T 
1 K i=l 


k (0(N  - M.  .)T.a)  -0<N-M.  )t° 

rr  I — A I I — 1 l 

c 

n 


for  the  GPM  (a  = 1 corresponds  to  J-M,  a = 2 corresponds  to  S-W).  Denoting  the 
natural  log  of  the  likelihood  function  by  KsK(Nj,  ...,Ng:  0,N,o)  it  is  seen  from  Sec- 
tion 4.1.3  that 
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The  classical  theory  of  maximum  likelihood  (cf.(I,3)  for  example)  states  that  under 
certain  conditions,  the  distribution  of  (N,0,o)T  ( T denotes  transpose)  is  asvmptot- 
ically  multivariate  normal  with  mean  (N,  0,o)T  and  covariance  matrix 
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c-1  = 


\ ON2' 

'92K  \ 
^8N  80  / 

( 92K  \ 

- F l 

1 82K  \ 

\ dNB 0 / 

E! 

\ 8 02  / 

( s2k  ) 

-E 1 

f 82K 

\8N8a  / 

\ 8 09 0 , 

where  the  partial  derivatives  are  evaluated  at  the  actual  values  of  N,  0,  and  a . This 
result  is  true  for  the  GPM  and  can  be  proven  in  a straight  forward  manner  (see  sec- 
tions dealing  with  the  derivation  of  asymptotic  properties  of  LSEs  for  the  general 
techniques  involved).  Performing  the  partial  differentiations  and  taking  expections 
gives 


K <(>  t . a 1 k 

2 m ct>  i T°  T°  ,nTi 

i=l  1-1  i=l  i=l 


s <”-*!-!>  V 
0 
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where  the  lower  triangular  portion  is  left  blank  since  the  entries  are  the  same  as 
their  diagonal  opposites  due  to  symmetry.^  Since  N,  0,  and  a are  unknown  CT1  may  be 
estimated  by  replacing  N,  0 , and  a by  N,  0,  and  a.  It  should  be  noted,  though,  that 
doing  this  makes  C_1  a random  variable  and  it  is  no  longer  correct  to  say  that 


N - N 
0-0 
a - a 


is  approximately  normally 


d with  mean  0 and  covariance  matrix  C"1  for  large  k,  where  C 1 is  C“*  with 
a replaced  by  N,  0 and  a . All  one  can  say  is  that  C”*  is  the  maximum 


distributed  with  mean  0 and 
N,  0 and  a replaced  by  N,  < 
likelihood  estimator  of  C“* . 


The  matrix  C-1  is  difficult  to  compute  algebraically  and  it  is  recommended  here 
that  estimators  C-*  be  computed  via  a digital  computer.  The  estimators  of  the 
variances  and  covariances  of  N,  $ and  a would  then  be 

* a * — 1 
Var  (N)  = Cn 


Var  (5)  ft  C‘J 

V'ar  <&)  C;l3 

Cov  (N,  5)'=  C~l  = C-J 

Cov  <N,  5)  3=  C^l  = C-J 

cov  (5,  o)a=  c-J  = c-J 

where 


denotes  the  entry  in  the  row  and  column  of  C * and  whe^e  "a"  means  asymp- 
totically equal,  in  probability,  as  k —a. 


For  the  S-W  and  J-M  models,  the  results  are  the  same  except  a is  no  longer 
estimated  and  the  last  row  and  last  column  of  C-1  are  eliminated.  In  these  cases, 
the  inversion  is  easily  carried  out  and  it  follows  that 


where 

k k T a 

D = y (N-M.  ,)r°  V — 1 _ 

^ i-l  i (N-M.  ) 

i=l  i=l  1-1 

and  0=1  for  J-M,  a = 2 for  S-W.  As  before, 
mum  likelihood  estimators  of  these  quantities  are  obtained  by  replacing  N and  <t>  by 
N and  5.  To  verify  these  results  for  the  J-M  and  S-W  models,  the  sampling  distri- 
butions of  N,  along  with  their  asymptotic  variances  and  covariances,  were  simulated 
based  on  true  values  of  N = 300  and  d>  = 0.01 . This  was  accomplished  by  taking  k = 25 

intervals,  t[  = io  for  all  i,  and  Mj_i  = 10  (i  - 1),  i = 1,2 k.  Five  hundred  sets 

of  observations  Ni,  N2.  . . N25  were  generated  using  a Poisson  random  number  gen- 
erator. For  each  of  the  500  sets,  N,  $ were  computed  and  stored.  Histograms  of  the 
500  N's  and  $'s  were  printed  along  with  a tabulation  of  the  class  intervals.  Following 
the  class  interval  tabulations  are  the  sample  standard  deviation  and  mean  (of  the 
respective  parameter  estimator).  Finally,  the  sample  covariance  between  the  500  N's 
and  $'s  is  printed  along  with  the  asymptotic  variances  and  covariances  derived  in  this 
section  computed  for  N = 300,  M^  = 10  (i  - 1),  <t>  = 0.01,  ^ = io  for  all  i,  and  k = 25. 
These  simulations  and  calculations  are  presented  in  Figure  4. 4. 1.1  for  the  J-M  and 
S-W  models.  As  shown,  even  for  such  small  k the  asymptotic  results  are  quite 
accurate  and  approximate  normality  is  valid. 


since  N and  <t>  are  unknown,  maxi- 


4.4. 1.2  IBM  Model 

In  the  proof  of  asymptotic  normality  of  MLEs  the  key  observation  is  that  the  vec- 
tor of  partial  derivatives  of  the  log  of  the  likelihood  function  with  respect  to  the 
parameters  estimated  is  asymptotically  normally  distributed.  In  the  case  of  the  IBM 
model  (letting  K denote  the  log  of  the  likelihood  function)  it  is  seen  from  Section  4.1.5 
that 
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Figure  4. 4. 1.1.  Simulation  of  the  Distributions  ol  the  MLF.s  lor  the  J-M  and  S-W  Models 
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Figure  4.4. 1.1.  Continued 
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Figure  4. 4. 1.1.  Continued 
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ASYMPTOTIC  COV(NHAT, PHIHAT)  = -0.0014 
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li  we  assume  0 - t < t < • ■ * < t,  , 
probability  1,  u 1 


. . then  it  is  obvious  that  with 
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(see  Section  3.2  for  a discussion  of  N (<»)). 

Now  N (<*>)  has  a Poisson  distribution  with  mean  a,  and  hence  9K/8a  is  not 
normally  distributed  asympototically  so  that  the  classical  theory  for  asymptotic  prop- 
erties of  maximum  likelihood  estimators  cannot  hold  exactly.  However,  for  large  a 
(a  is  usually  quite  large  since  it  is  the  expected  total  number  of  errors  observable  in 
infinite  time)  N (x)  is  approximately  normally  distributed  with  mean  a and  variance 
a.  So,  the  vector  ,a  - ai  is  asymptotically  distributed  approximately  as  a bivariate 
(b  - b) 

normal  random  variable  w'ith  mean  / 0\  and  covariance  matrix 
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Performing  the  matrix  inversion  leads  to  the  following: 
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To  verify  these  results,  the  distributions  of  a and  b were  simulated  by  a similar 
procedure  to  that  described  in  Section  4.4. 1. 1.  The  true  value  of  a was  300  and 
that  of  b was  0.01.  Again,  k was  taken  to  be  25  and  tj  = lOi,  i = 1,  2,  ...  , 25. 
Even  though  k = 25  is  not  large,  the  results  derived  here  are  consistent  with  the 
simulation  results  shown  in  Figure  4.  4. 1.  2. 

4.4  2 L arge  Sampl e Results:  LSE 

Since  there  is  no  literature  available  concerning  the  asymptotic  properties  for  the 
least  squares  estimators  a general  procedure  will  be  derived  here  which  will  be 
applied  to  the  J-M  and  S-W  and  IBM  models  in  Sections  4.4.2. 1 and  4. 4. 2. 2, 
respectively.  Due  to  the  complications  existing  in  the  Binomial  models  (e.g.,  suc- 
cessive observations  are  not  independent)  the  asymptotic  properties  were  not  derived. 
For  the  three  parameter  GPM  the  algebraic  manipulations  and  calculations  are  too 
unwieldly  to  be  included  here  but  may  be  carried  out  on  a computer.  The  technique 
for  three  parameters  is  the  same  as  that  for  two  parameters  which  will  now  be 
illustrated. 


Let  Xi,  . . . Xfc,  ...  be  independent  random  variables  with  finite  moments 

E (xi)  = Ri  (6l.  02>  . Var  (Xj)  = ffj  (0i,  02>  E | Xj  - gi  (Gl,  02>  | 3 = Pi  (01.  02) 
satisfying  Liapunov's  condition  [c.f.  (1,2) 
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for  each  finite  value  of  the  unknown  parameters  01  and  O2.  It  is  assumed  that 
gi(0l,  02)  is,  for  each  i,  three  times  continuously  differentiable  in  a neighborhood 
of  the  true  value  of  0 2)^.  Let 
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and  define  0^,  02  (dependence  on  k has  been  suppressed  for  notational  convenience) 
such  that 
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Figure  4. 4. 1.2.  Simulation  of  the  Distributions  of  the  MLEs  for  the  IBM  Model 
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Figure  4. 4. 1.2.  Continued 
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Under  the  additional  assumptions  that  as  k — ® , 


0 


The  matrix  \ is  asymptotically  equal  to 


-1 


in  probability.  The  matrix  N is  thus  asymptotically  equal  (in  probability)  to 


where 


L 


The  general  discussion  in  Section  4. 4. 2 can  now  be  applied  to  the  J-M  and  S-W 
models.  As  mentioned  earlier  the  asymptotic  covariance  matrix  for  the  estimators  in 
the  GPM  may  be  derived  in  a manner  similar  to  that  illustrated  in  4.4. 2 and  the 
computations  carrier  out  on  a computer. 


For  the  J-M  (a  = 1)  or  S-W  (a  = 2)  models,  6.  and  6 in  Section  4.4.2  may  be 
taken  to  be  N and^,  respectively  and 
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gj  (er  63)  = gj  (N,  4> ) = ♦(N  - M^)  T J ® 


o-  2 <elt  02)  = <r.2  (N,  4>  ) = 4>(N  - M^) 
i 


for  i = l,  2,  ...  , k,  ...  . Computing  the  necessary  partial  derivatives  leads  to 
(see  Section  4.4.2  for  notation  and  symbols) 


y 

*—3 


V (N-Mj^)2  r.2°  - J ^(N-Mi_i)Ti2° 
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- S'  4>  ( N - M . , ) r? 
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I)’  =(  y 4> 2 r.2°  y (N  - M._1)2  Ti2°-  ( ^ 4.(N  - M.^)  T. 
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i=l 
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i=I 


N - N \ / 0 \ 

So  that  | j is  asymptotically  normally  distributed  with  mean!  land 

i i - <t>  / 'O' 

convariance  matrix 


I 
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where 

tQ  = 0,  i = 1,  2,  , k. 


Taking  the  necessary  partial  derivatives  and  referring  to  Section  4.  4.2  for  notation 
and  symbols,  it  is  seen  that 
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Figure  4. 4. 2.1.  Simulation  of  the  Distribute 
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OBSERVE!  VALUE  OF  HERN  « 30 I. 5396 

OBSERVED  VALUE  OF  STANDARD  DEVIATION  = 17.6305 


CLASS  INTERVAL  TABLE 


Figure  4. 4. 2.1.  Continued 
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STATISTIC:  Least  squares 
estimate  of  N 

ACTUAL  VALUES  OF  PARAMETERS 
N » 300,  Phi  = 0.01 


82755-22 
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Figure  4. 4. 2.1.  Continued 


MODEL:  Schick 
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CLASS  INTERVAL  TABLE 
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ASYMPTOTIC  VAR(NHAT)  = 31.0411 

ASYMPTOTIC  VAR (PHIHA  T)  = 0.00000013 
ASYMPTOTIC  COV<NHAT,PHIHAT)  = -0.0018 

CHECK  VALUE  = -0.0018 


To  verify  these  results,  the  sampling  distribution  of  a and  b were  simulated  under 
the  same  conditions  described  in  Section  4.  4. 1, 2,  i.  e.  , true  a of  300,  true  b of  0.  01, 
etc.  The  results  are  shown  in  Figure  4.  4.  2.  2.  Note,  as  in  the  case  of  the  J-M  and 
S-W  models,  the  asymptotic  results  are  very  accurate  for  k = 25. 
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Figure  4.4. 2. 2.  Continued 
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AairtPToTIt  Va.v.AHhI.*  = i28.0fa?2 

ASYnFTOTIC  VAK.&HAT)  = 0.0000013? 
uStrtPTOTIC  CGVi AHrtT . BHAT > - '0.0113 

CHECK  VALUE  = -0.0113 


4.  4.  3 Small  Sample  Considerations 


Due  to  the  nature  of  the  models  considered  here,  the  typical  statistical  techniques 
used  in  deriving  small  sample  distributions  and  confidence  bounds  on  unknown  param- 
eters failed  and  further  investigation  yielded  no  significant  results.  Furthermore, 
one  may  be  tempted  to  use  the  asymptotic  results  to  derive  confidence  bounds  since 
they  were  shown  to  be  quite  good  for  k as  small  as  25.  However,  it  should  be  noted 
that  for  values  of  the  parameters  other  than  those  used  in  the  simulations  k = 25  may 
not  be  sufficient. 

4.  5 ESTIMATING  THE  TIME  RE  QUIRED  TO  REMOVE  A SPECIFIED  NUMBER  OF 

THE  REMAINING  ERRORS 

4.  5. 1 Introduction 

Each  of  the  models  contains  parameters  which,  in  order  to  use  the  models  for 
predicting  reliability  behavior,  must  be  estimated.  From  these  parameter  estimates 
other  more  important  parameters  can  be  estimated.  In  this  section  we  treat  the 
estimation  of  expected  time  to  remove  a specified  number  of  the  remaining  errors  for 
the  GPM,  J-M,  S-W,  binomial,  and  IBM  models.  It  will  be  assumed  that  M errors 
have  been  removed  "to  date"  and  that  each  remaining  error  is  removed  when  it  is 
observed.  That  is,  what  will  be  of  concern  are  the  times  between  successive  error 
finds  (removals)  denoted  by  SM+1,  SM+2  ....  For  the  GPM,  J-M,  S-W,  and  binomial 
models,  the  sequence  will  terminate  with  SN,  the  time  between  the  N-l  lst  and  N* 
errors  (by  definition,  the  N01  error  is  the  last  error  for  these  models). 

4.  5. 2 GPM,  J-M,  and  S-W  Models 


In  this  section  the  analysis  will  be  carried  out  for  the  GPM  model  since  the  J-M 
and  S-W  results  will  follow  byo=l  ando=2,  respectively,  in  the  results.  Under  the 
assumptions  of  Section  4.  5. 1,  the  expected  time  to  remove  ksN-M  of  the  remaining 
N-M  errors  is 


M+k 
= 2 


i=M+l 


, k = 1,  2,  ...  N-M. 


The  distributions  of  the  Sj's  are  easily  obtained  as  follows:  The  event  {Si^t} 
(for  t >0)  is  equivalent  to  the  event  {number  of  errors  in  time  interval  of  length  t 
beginning  at  the  time  when  the  i-l8t  error  occurred  is  zero}.  Since  this  interval  has 
length  t and  a total  of  i-1  errors  have  been  removed  un  to  the  time  of  the  i-18*  error 
find/removal,  it  is  seen  that 


PfSj  > t}  = 


[4>(N-  (i-1))  t°) 

or 


0 


-4>(N-  (i-1))  t° 

e 


- e 


-4>(N-i+l)t‘ 


t > 0. 
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The  probability  density  function  of  S.  is 


f (t)  = o4>(N-i+l)ta_1  e”^N_i+1^  , t >0; 

u, 

1 

a Weibull  distribution. 

The  expectation  of  S.  is 


CO 

E (S.)  — f tfs 

1 Jo  i 


(t)  dt 


JL  i _i 

= f<t>(N-i+i))  Q o r (o  x) 

(noting  that  !'(•)  is  the  usual  gamma  function)  and  hence, 


M+k  ! 

T =o_1r(o"1)  ^ r<MN-i  + l)l  ° 

k 

i=M+l 


Since  N,  <b,  a are  unknown  in  general,  two  estimates  of  T^,  T^  and  are  avail- 
by  replacing  N,  4>,  and  o by  their  MLEs  or  LSEs,  respectively.  For  example, 


able  by  replacing 

* “1  r*  / I 


M+k 


t,  = a * T ( a 1)  y f*(N  - i + 1)}  a . 
i=M+l 

It  should  be  pointed  out  that  when  estimates  are  used  for  N,  6,  a , k must  be  less 
than  or  equal  to  N-M  or  51- M,  depending  on  the  method  used  (MLE  or  LSE). 

4.  5. 3 Binomial  Models 

The  expression  for  the  expected  time  to  remove  k - N-M  of  the  remaining  N-M 
errors  is  , as  in  Section  4.5.2, 

M+k 

\ 


Tk  = ^ E(Si)  , k = 1,  2 N-M. 


i=M+l 


The  probability  density  function  of  Sj  is  derived  as  follows:  The  event  (for  t >0 
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S.  ?tl  is  equal  to  the  event 


{no.  errors  in  the  time  interval  of  length  t beginning  at  the  time  of  the  i-15 
error  find/removal  is  zero]  . Using  the  Binomial  model  (see  Section  3.  3 for 
otations). 


/ N-  (i- 1 ) \ 

\ o /p<t)  (1'r 


N-(i-l)  -0 


N-i  + 1 

= (l-P(t))  , t > 0. 

For  Binomial  I,  p(t)  = t/(t+c).  In  this  case, 

N-i+1 

PIS.  >t}=  { c/ (t+c) } , t > 0, 


and  the  probability  density  function  of  is 


fg  (t)  = {N-i  + 1}  { c/(t+c)  } c/(t+c/ 

i 


. ,1  (N-i+1)  / (N-i+2) 

= { N-  i + 1 ] c / (t+c)  , t > 0. 


The  expected  value  of  S.  is 


E(S.)  = / tfg  (t) 


o i 


= / x when  i = N 

1 (i.e.  the  last  error  is  never  found, 

I on  the  average) 


c (N-i)  for  i = M+l,  M+2,  ...,  M+k;  k = 1,  ....  N-M-l. 


For  the  Binomial  II,  p(t)  = 1-e  so  that 


?.2t]  = e -a(N"  1 + 1)1 


and  the  probability  density’  function  of  S.  is 


I 


fg  (t)  = a (N-i  + 1)  e a(N_i+1)t 


with 


-1 

E(S.)  = fa  (N-i  + 1)} 


Due  to  the  one  case  of  infinite  expection  for  the  Binomial  I model,  Tk  is  no: 
defined  when  k = N-M.  Thus,  for  the  Binomial  I model, 

MH-k 

Tk  = ^ c (N-i)  1 , k = 1,  2 N-M-l, 

i=M+l 

with  N,  c replaced  by  N,  c or  N,  c for  estimators.  For  the  Binomial  II  model. 

M+k 

Tk=  { a(N-  i + 1)}  , k=  1,  2,...,  N-M, 

i=M+l 

with  Tk  and  Tj.  defined  analogously  as  in  Section  4.  5.  2.  Note  the  similarity  hen  wri- 
the results  for  the  J-M  model  of  Section  4.  5.  2. 


4.5.4  IBM  Model 

For  the  IBM  model,  an  analysis  of  the  type  discussed  in  Sections  4.6.2  and  .63 
is  not  appropriate.  The  reason  for  this  is  best  explained  by  examining  the  distribution 
of  the  time  to  detect  the  next  error,  given  M prior  detections  in  time  t (this  is  equiv- 
alent to  The  event  : S^j+j?  h ] (h  >0)  is  equivalent  to  the  event  [no.  of  errors  in 

the  interval  t to  t+h  is  zero}.  Since  the  processes  N(t)  has  independent  increments. 

P f S > h 1 = e""(m(*+^)  ~ m(t)) 

1 M+l " n ' e 


so  that 


..  _ <■  _ . - 

iim  v t&M+1>n)  = e ' 

h — on 

and  hence,  S^+j  = ® with  probability  e ^ m(^))  and  therefore  has  infinite  expectation. 

This  probability  that  SM+1  = « can  be  made  arbitrarily  small  by  taking  t large  (i.  e. 
by  observing/ removing  errors  for  a longer  time).  However,  it  is  clear  that  an 
analysis  of  the  type  performed  earlier  will  fail. 

However-*  there  is  a rr:er>sure«f«the  time  U required  U>  row  wo,  on  tk?  average.-  ,< 
a specified  fraction  q of  the  expected  number  of  errors  ever  observable,  namely  a.  To 
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a*. 


derive  this  measure,  suppose  that  the  specified  fraction  is  q.  On  the  average,  then, 
the  number  of  errors  observable  after  time  t is  E (N(  « ) - N(t))  = a - a (l-e"b*)  = 
ae-bt.  Thus,  it  is  desired  that 


(i-q) 


0 < q < 1, 


or  tq  = -In  (1— q)  / b.  Since 

" A A <V 

b is  unknown,  estimates  of  tqare  tqand  fq  where  b is  replaced  by  b and  b,  respectively. 


Section  5.  0 


MODEL  VALIDATION  AND  DATA  ANALYSIS 


5.  I PARAMETER  ESTIMATES 


5.1.1  Model  Comparisons 


The  parameter  estimates  for  data  sets  1-16  and  for  the  various  model 
aie  given  in  Tables  5.1.1  - 5.  1. 16.  The  notations  and  abbreviations  used 
subject  tables  can  be  found  in  the  Glossary  given  in  Section  8.0. 


types 
in  the 


The  asterisks  used  in  the  tables  to  indicate  the  absence  of  an  estimate  require 
some  further  explanation.  A single  asterisk  indicates  that  after  reasonable 
diligence  with  varying  starting  points  we  could  not  obtain  convergence  of  the  numeri- 
cal method  to  a single  parameter  point.  The  cause  of  this  lack  of  convergence  is, 
in  addition  to  the  absence  of  a rational  method  of  selecting  starting  points,  probably 
the  failure  of  the  data  to  satisfy  an  assumption  of  the  model  - the  expected  number  of 

furrT«t.per  ^ time  is  a decreaaing  function  of  time.  The  reason  we  say  this  is  that 
the  IBM  model  never  had  a convergence  problem  and  it  did  not  assume  a decreasing 
mean  error  rate;  only  that  the  cumulated  failures  in  time  is  an  increasing  function 
of  time.  All  data  satisfy  that  assumption. 


I he  double  asterisk  indicates  a situation  in  which  the  numerical  method 
converged  to  a point  but  the  number  was  absurd.  For  example  N might  have  been 
100  when  the  number  of  errors  already  observed  was  200.  All  of  the  models 
suffered  from  this  problem  although  it  was  not  quite  as  prevalent  as  the  single 
asterisk.  In  this  connection  the  large  number  ot  negative  6 s and  5 s occurring  fui 
the  GPM  should  be  mentioned.  This  was  not  given  a double  asterisk  even  though  a 
negative  a means  the  expected  number  of  errors  decreases  as  the  interval  width 
Increases.  This  may  sound  absurd  but  it  violates  no  explicit  assumption.  Also  it 
is  interesting  to  note  that  the  negative  values  are  often,  indeed  most  often,  not  far 
from  zero  and  the  problem  could  have  been  caused  bv  the  point  previously  raised  - 
the  observed  mean  error  rate  per  unit  time  was  not  decreasing.  In  the  few  cases 
in  which  a pot*. live  5 or  a was  obtained  the  number  was  nearer  zero  than  one 
(J-M  model)  or  two  (S-VV  model)  lending  credence  to  the  belief  that,  at  least  for 
the  observed  data,  the  error  occurrence  rate  did  not  depend  on  the  interval  width, 
or  at  least  that  dependence  was  obscured  by  other  more  dominant  factors. 


TABLE  5. 1. 1.  DATA  SET  1 - PARAMETER  ESTIMATES 
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TABLE  5. 1. 2.  DATA  SET  2 - PARAMETER  ESTIMATES 


TABLE  5. 1. 3.  DATA  SET  3 - PARAMETER  ESTIMATES 


Data  Set 
Model 

4-fd 

4-fdG 

4-fx 

4-fxG 

4-ff 

4-ffG 

GPM 

N 

17208 

13398 

15672 

a 

♦ 

-.0199 

* 

★ 

-.1233 

-.0138 

4> 

. 0023 

.0023 

.0025 

J-M 

N 

4297 

4246 

4141 

4181 

4261 

4232 

$ 

.0094 

.0078 

.0073 

.0063 

.0092 

.0076 

S-VV 

N 

$ 

** 

* * 

** 

** 

** 

* 

GPM 

A 

N 

14928 

34956 

11010 

13621 

a 

* 

-.0389 

-.4831 

* 

-.1996 

-.0336 

* 

.0027 

9.29xl0“4 

.0029 

.0029 

J-M 

N 

4924 

4929 

* * 

** 

4453 

4453 

* 

.0087 

.0087 

.0100 

.0101 

S-VV 

N 

★ 

** 

♦ 

% 

♦ * 

♦ 

Binomial  N 

4346 

4308 

4212 

4223 

1 

c 

107 

128 

136 

158 

*** 

*** 

Binomial  N 

4322 

4277 

4198 

4204 

II 

a 

. 0094 

.0078 

.0073 

.0063 

*** 

*** 

IBM 

a 

3073 

2130 

2159 

1730 

b 

.4087 

.4670 

.4522 

.4660 

*** 

*** 

IBM 

a 

4967 

4965 

4229 

4230 

A 

b 

. 2587 

.2591 

.3363 

.3361 

*** 

*** 

No.  of 

Errors 

4 176 

4176 

4176 

4176 

4176 

4176 

No.  of 
Time 

Intervals 

161 

124 

149 

121 

161 

124 

TABLE  5. 1.  5.  DATA  SET  5 - PARAMETER  ESTIMATES 
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TABLE  5. 1.6.  DATA  SET  6 - PARAMETER  ESTIMATES 


Data  Set 
Model 


S-W 


G PM 


J-M 


S-W 


N 

A 

N 

A 

a 

* 

N 

N 

& 


Binomial  N 

I c 
Binomial  N 

II  a 
IBM  a 

b 

IBM  a 
b 

No.  of 
Errors 

No.  of 

Time 

Intervals 


6-fd 


919 
6.4x10 
636 
5.4x10 


-4 


-5 


594 

4.8x10 

873 

1440 

895 

. 0007 

458 

.0384 

2.75x10 

1.11x10 


406 


-4 


9 

-8 


134 


6-fdG 


1324 
6.99x10 
547 

7.62x10' 


-4 


875 

7.62x10 

1198 

1239 

1253 

.0008 

1218 

.0225 

1.49x10 

2.04x10 

406 

28 


5 

-4 


X No  fix  data  available 


I 


I 


I 


F 


; 

! 

f 


i 

I 


TABLE  5. 1. 10.  DATA  SET  10  - PARAMETER  ESTIMATES 


TABLE  5. 1. 11.  DATA  SET  11  - PARAMETER  ESTIMATES 


TABLE  5. 1. 13.  DATA  SET  13  - PARAMETER  ESTIMATES 


f 


■ 

I 

[ 
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Data  Set 
Model 

13-fd 

13-fdG 

13-fx 

13-fxG 

13-ff 

G PM  N 

2778 

115405 

2895 

5 

-.0544 

★ 

* 

.0975 

-.0457 

$ 

.0024 

2.  38xl0"4 

.0023 

J-M  N 

$ 

** 

** 

♦ 

** 

** 

S-W  N 

$ 

** 

** 

* 

** 

** 

GPM  N 

2497 

211922 

45335 

2631 

a 

-.0149 

-.0740 

* 

.1108 

-.0058 

.0028 

7. 59xI0"5 

5. 94xl0"4 

.0026 

J-M  N 

1604 

1912 

1804 

i 

** 

.0035 

* 

.0024 

.0026 

S-W  N 

i 

* 

** 

* 

** 

** 

Binomial  N 

1 c 

** 

** 

* 

** 

*** 

Binomial  N 

II  a 

** 

** 

♦ 

** 

*** 

IBM  a 

764 

876 

937 

1005 

b 

.1552 

.1263 

. 1084 

.1139 

**  * 

IBM  a 

1612 

1611 

1947 

1938 

b 

. 1036 

. 1042 

.0705 

.0719 

*** 

No.  of 
Errors 

1572 

1572 

1789 

1789 

1572 

No.  of 

Time 

Intervals 

360 

110 

101 

52 

360 

TABLE  5.  1.  14.  DATA  SET  14  - PARAMETER  ESTIMATES 
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TABLE  5. 1. 15.  DATA  SET  15  - PARAMETER  ESTIMATES 


Interval 


TABLE  5. 1. 16.  DATA  SET  16  - PARAMETER  ESTIMATES 


Data  Set 


Binomial  N 

I c 
Binomial  N 

II  a 
IBM  a 


No.  of 
Errors 

No.  of 

Time 

Intervals 


16-fd 

16-fdG 

16-fx 

16-fxG 

16-ff 

16-ffG 

2572 

5181 

5368 

13385 

. 1433 

.0476 

-. 1691 

-.0317 

* 

** 

-4 

.0126 

.0093 

.000, 

9.78x10 

** 

28191 

1596 

1515 

** 

** 

-5 

2.6x10 

.0011 

.0015 

** 

** 

1366 

** 

3607 

** 

-6 

.0001 

2.03x10 

1703 

3790 

4820 

13535 

. 1239 

.0459 

-. 1698 

-.0296 

★ 

* 

. 0253 

.0134 

.0007 

9.63xl0"4 

3698 

3813 

1796 

1793 

* 

-4 

-4 

5.49x10 

5.29x10 

.0017 

.0017 

* 

* 

** 

♦ 

4710 

13162 

-6 

-6 

8.92x10  0 

1.99x10 

6908 

1600 

1550 

♦ 

*** 

*** 

8672 

706 

678 

1598 

1533 

* 

** 

*** 

*** 

.0011 

.0015 

1.22.X108 

1.03xl08 

859 

943 

*** 

-7 

-7 

2.14x10 

2.10x10 

.0617 

.0733 

13826 

13064 

1804 

1799 

*** 

*** 

.0038 

.0040 

.04  98 

.0501 

1333 

1333 

1332 

1332 

1333 

1333 

42 

28 



458 

113 

42 

. 

The  three  asterisks  indicate  a situation  in  which  the  model  was  inappropriate. 

For  example  the  three  asterisks  invariably  present  in  the  ff  and  ffG  columns  indicate 
that  IBM  model  can  only  accommodate  one  date  or  time,  either  find  or  fix,  but  not 
both  (as  ff  and  ffG  require). 

The  IBM  model  appears  to  yield  parameter  estimates  most  otten.  The  few  cases 
of  double  asterisks  for  the  IBM  model  were  due  to  convergence  to  negative  estimates 
of  parameters  (a,  b)  which,  a priori,  must  be  positive.  For  example,  a negative 
a or  a (in  the  face  of  a positive  estimate  of  b)  would  imply  a negative  mean  number 
of  occurrences  per  arbitrary  time  interval,  an  absurdity.  In  short  the  estimates 
lie  outside  the  parameter  space.  There  are  other,  seeming,  absurdities  about  the 
IBM  model.  For  example  in  data  set  1 an  estimate  of  a on  the  order  of  10^®  seems 
high  when  only  2035  errors  have  been  observed.  This  is  probably  an  indication  of 
poor  fit.  Another  problem  with  the  IBM  model  is  illustrated  in  data  set  2 in  which  the 
LSE  of  a,  namely  a,  is  less  than  the  number  of  errors  observed.  Now  a is  the 
limiting  (t  — • ® ) value  of  the  mean  number  of  occurrences.  The  fact  that  a is  on  the 
order  of  5000  when  the  observed  number  of  errors  is  7819  is  not,  [Ter  se,  absurd.  But 
if  the  mean  number  of  occurrences  is  5000  (t  — - ®),  to  observe  7819  is  extremely 
unlikely.  Here  again  the  IBM  model  is  probably  just  a poor  fit.  In  spite  of  these 
problems  the  IBM  model  behaves  well  overall. 

Even  though  the  MLEs  are  not  available  for  the  Binomial  model  the  LSEs  for 
the  parameter  N,  the  most  important  parameter,  behave  very  nicely  and  always 
seem  to  be  reasonable  estimates. 

As  previously  mentioned  the  J-M  and  S-\V  models  are  special  cases  (o  = 1, 
o~2  respectively)  of  the  GPM  model.  One  of  the  disturbing  results  of  the  data 
analysis  is  that,  by  and  large,  the  estimate  of  N is  somewhat  sensitive  to  which 
model  is  used. 

One  feature  that  is  true  of  all  the  models  is  clear  enough:  the  parameter 
estimates  remain  pretty  much  the  same  irrespective  of  whether  the  data  is  grouped 
(>  ten  per  interval)  or  not.  To  some  extent  it  is  even  immaterial,  to  the  parameter 
estimates,  whether  the  data  was  find  or  fix. 


5. 1.2  Comparison  of  MLEs  and  LSEs 

There  are  many  properties  (consistency,  unbiasedness,  mean  square  error, 
variability)  with  which  to  compare  estimates;  often  which  properties  are  chosen  for 
comparison  is  a matter  of  personal  taste  or  is  dictated  by  the  situation. 

In  many  respects  a "must"  property  is  consistency  (the  estimate  converges  to 
the  parameter  estimated  in  probability  as  the  "sample  size"  gets  large).  Both  the 
LSE  and  MLE  are  consistent.  It  is  almost  universally  agreed  that  a proper 
measure  of  comparison  of  estimators  invites  some  measure  of  precision.  We  give, 
in  Tables  5.  1. 2. 1,  5. 1 . 2.  2 and  5. 1 . 2.  3 comparisons  based  on  the  standard 
deviation  (std.  dev. ) of  the  estimators.  The  observed  std.  dev.  is  taken  from  the 
simulations  discussed  in  Section  4.4.  The  asymptotic  std.  dev.  was  also  discussed 
in  Section  4.  4. 
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TABLE  5.  1.2. 1.  COMPARISON  OF  MLEs  AND  LSEs  FOR  THE  J-M 
AND  S-W  MODELS  FOR  ESTIMATING  N 


J-M 

S-W 

N 

R 1 

N 

N 

Observed  Std.  Dev. 

Hi.  3 

17.  r, 

4.99 

5.  65 

Asymptotic  Std.  Dev. 

l.  15-6 

I7.r, 

4.  94 

5.  57 

1 

TABLE  5.  1.2.2.  COMPARISON  OF  MLEs  AND  LSEs  FOR  THE  J-M 
AND  S-W  MODELS  FOR  ESTIMATING  <t> 


- 1 

I 

S-W 

4> 

$ 

5 

* 

Observed  Std.  Dev. 

0.0010 

0.0011 

' 0.0003 

0.0004 

Asymptotic  Std.  Dev. 

0.0010 

1 

0.  001 1 

0.0003 

1 0. 0004 

TABLE  5.  1.2.3.  COMPARISON  OF  MLEs  AND  LSEs  FOR  THE  IBM 
MODEL  FOR  ESTIMATING  a 


a 

. 

a 

Observed  Std. 

Dev. 

20.  4 

21.3 

Asymptotic  Std 

1.  Dev. 

19.2 

20.  7 

From  the  first  two  tables  (5.  1.2. 1 and  5.  1. 2.2)  it  is  obvious  that  for  the  J-M 
and  S-W  models  the  MLE  is  superior  to  the  LSE  for  both  N and  d>,  because  the 
observed  and  asymptotic  std.  dev.  's  are  always  smaller  for  the  MLE.  However, 
speaking  relatively  the  superiority  of  the  MLE  over  the  LSE  is  not  large  enough  to 
be  overridden  by  other  considerations;  for  example  if  convergence  cannot  be 
obtained  for  the  MLE  the  LSE  is  perfectly  satisfactory.  The  same  kind  of  results 
are  also  true  for  Table  5.1.2.  3 which  considers  the  IBM  model. 

Another  point  is  worth  noting  from  the  first  two  tables:  the  std.  dev.  's  are 
less  (and  sometimes  a good  deal  less)  for  the  S-W  model  than  for  the  J-M  model. 
This  does  not  mean  the  S-W  model  is  superior  to  the  J-M;  after  all  the  decision 
of  which  model,  if  either  fits,  must  be  made  on  goodness  of  fit  grounds.  The 
differing  std.  dev.  's  for  the  two  models  merely  reflects  the  dependence  of  the 
std.  dev.  on  a . 


I 

* 
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Tables  5. 1. 2.4  and  5. 1. 2.  5 give  a feeling  for  the  closeness  of  the  MLE  and  LSEs 
for  selected  (so  that  estimates  existed)  data  sets  for  fxG  data.  The  fxG  data  was 
selected  because  we  felt  it  led  to  the  best  estimates.  Mostly,  the  comparable 
values  are,  as  expected  from  the  std.  dev.  results  previously  given,  close.  How- 
ever there  is  another  point,  raised  ever  so  slightly  bv  Table  5. 1 . 2.4:  the  MLEs 
(hats)  seemed  to  experience  slightly  more  convergence  (single  asterisk)  problems 
than  the  LSEs  (tildes).  In  the  three  models  (GPM,  J-M,  S-VV)  for  which  they  were 
both  used  there  were,  over  the  Ifi  data  sets  and  over  the  data  types  (fd,  fdG,  etc. ), 

8(>  c ases  of  single  asterisk  for  the  MLE  and  only  35)  cases  for  the  LSE.  However 
if  the  double  asterisk  situation  is  included  the  two  methods  of  estimation  are  more 
comparable. 


5.  1 . .1  Estimation  of  N 

All  of  the  models  have  parameters  (parameter  values)  peculiar  to  themselves. 

For  example  the  J-M  has  o 1 and  the  S-W  has  o = 2,  while  the  IBM  model  has  no 
o at  all.  The  universal  and  important  parameter  is  N (a  in  the  IBM  model),  for  it  is  the 
unknown  number  of  errors  present  in  the  S W.  Indeed  having  estimated  N with, 
sa\  S,  then  the  ven  important  quantity,  (unknown)  number  of  errors  remaining  in 
the  S/W,  can  be  estimated  as  N - number  of  errors  already  removed. 

Table  5. 1.3. 1 gives,  for  fxG  data,  the  various  estimates  of  N for  the  various 
models.  The  IBM  model  was  not  included  because  it  really  does  not  estimate  N since 
the  parameter  a is  the  expected  (mean)  number  of  errors  as  t - ® . Only  the  MLEs 
are  shown  in  the  table.  Actually  the  J-M,  S-W  and  Binomial  models  give  fairly 
close  results  while  the  GPM  differs  a good  deai.  It  remains  clear  that  the  choice  of  a 
model  must  be  made  on  previous  experience  with  the  model  (how  well  it  fits),  not  in 
answers  one  is  looking  for. 


3.2  GOODNESS-OF- FIT  DEVELOPMENT 


To  test  the  goodncss-of-fit  for  the  S/W  reliability  models  the  following  statistic 
was  employed: 


X 


2 


k 

f 

i=  1 


(OBS.  - EXPECTED.)2 
EXPECTED. 

i 


where 


OBS. 


| N.  for  the  GPM,  J-M,  S-W,  and  binomial  models 
| Z.  for  the  IBM  model 
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TABLE  5. 1.2.4.  LSE  AND  MLE  PARAMETER  ESTIMATES 
FOR  POISSON  MODELS 


r 

i| 


N 

N 

4> 

— 

4> 

a 

a 

DATA  SET 

1-fxG 

7483 

9389 

- 

.0065 

.0051 

-.3543 

-. 3654 

2-fxG 

20447 

20491 

O 

O 

»—* 

o 

o 

-.0742 

-.0755 

5-fxG 

2929 

3011 

.0051 

.0050 

-.0725 

-.0711 

GPM  7-fxG 

7569 

7341 

.0059 

.0057 

.2481 

.2934 

13-fxG 

115405 

45335 

2.  38xl0“4 

4 

5.94x10  4 

.0975 

.1108 

14-fxG 

3972 

2727 

.0223 

.0251 

-.2314 

-.1200 

16-fxG 

13385 

13535 

-4 

9.78x10 

-4 

9.63x10 

-.0317 

-.0296 

1-fxG 

17059 

♦ 

5.  56x1 0“6 

★ 

2-fxG 

8004 

8170 

.0018 

. 0027 

4-fxG 

4181 

♦ 

. 0063 

♦ 

J-M  5-fxG 

650 

508 

6. 39xl0'4 

.0052 

7-fxG 

10820 

7434 

6. 19xl0'4 

.0014 

13-fxG 

♦ 

1912 

* 

.0024 

14-fxG 

793 

763 

.0043 

.0056 

16-fxG 

1515 

1793 

.0015 

.0017 

1-fxG 

3190 

♦ 

5.98xl0"6 

* 

s-vv  5-fxG 

7-fxG 

495 

13641 

♦ 

9278 

1.26xl0"5 

1.82xl0‘5 

* 

1 . 03x1 0~4 

14-fxG 

633 

593 

2.  07xl0~4 

-4 

4.36x10 
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TABLE  5.  1.2.5.  LSE  AND  MLE  PARAMETER  ESTIMATES 
FOR  THE  IBM  MODEL 


DATA  SET 

r * 

a 

b 

f> 

1-fxG 

1. 27x1 0'S 

7.27X1010 

2. 12x1 0”7 

1.22xl0~9 

2-fxG 

4997 

8181 

.0837 

.0804 

4-fxG 

1730 

4230 

.4660 

. 3361 

5-fxC. 

128 

515 

. 1125 

. 1543 

7-fxG 

9276 

8426 

. 0209 

.0356 

13-fxG 

1005 

1938 

.1139 

.0719 

1 4-fxG 

656 

842 

.1577 

.1511 

16-fxG 

943 

1799 

.0733 

. 0501 

TABLE  5.  1.3,  1.  COMPARISON  OF  MODELS  WITH  RESPECT 
TO  ESTIMATION  OF  N 


DATA  SET 

' 

C.  PM 

7m 

Iw 

Bin  I 

Bin  II 

1 -fxG 

7483 

17059 

3190 

13631 

* 

2-fxG 

20447 

8004 

* 

8025 

8014 

4-fxG 

* 

4181 

* 

4223 

4204 

5-fxG 

2929 

650 

495 

651 

650 

7-fxG 

7569 

10820 

13641 

9642 

6055 

13-fxG 

115405 

* 

* 

* 

♦ 

1 4-fxC. 

3972 

793 

633 

763 

772 

16-fxG 

13385 

1515 

* 

1550 

1533 

4>  (N  - M.  )t.&  for  the  GPM,  J-M  (o  - I)  and 
1 1 S-W  (o=2)  models 


i-1 


EXPECTED. 


P(t. ) (N  - N N.)  for  the  binomial  models  (p(r.) 
j=l 


T.  /(  T.  +C) 

I I 


or  p(  t ) = 1 - e aT* ) 
i 


a(e 


-btj-i  _ -btj 


e 1 ) for  the  IBM  model. 


Since  the  parameters  N,  <t> , a , a,  b,  etc.  are  unknown,  they  were  replaced  by  their 
MLEs  and  by  their  LSEs.  For  the  GPM,  J-M,  S-W  and  IBM  models  the  distribution 
of  X2  is  approximately  a chi-square  distribution  with  k degrees  of  freedom  when  the 
parameters  are  known.  This  is  true  because  each  term  in  the  sum  is  the  square  of 
a normalized  Poisson  random  variable  and  the  terms  are  independent.  When  the 
mean  of  a Poisson  is  10  or  greater,  it  is  approximately  normally  distributed  so 
that  X 2 is  approximately  the  sum  of  the  squares  of  k indepondent,  standard  normal 
random  variables  (this,  of  course,  is  one  way  of  defining  the  chi-square  distribution; 
i.  e. , the  distribution  of  the  sum  of  squares  of  independent  standard  normal  random 
variables).  When  the  parameters  are  replaced  by  their  MLEs  or  LSEs,  X-  has 
approximately  a chi-square  distribution  with  degrees  of  freedom  equal  to  k minus 
1 degree  of  freedom  for  each  parameter  estimated  (i . e . , k-3  for  GPM,  k-2  for 
J-M,  S-W  and  IBM  models).  For  the  binomial  models,  whenever  N is  large  and 
p(Tj)  small,  the  chi-square  approximation  should  be  valid  with  k-2  degrees  of 
freedom  when  the  parameters  are  replaced  by  their  MLEs  or  LSEs. 

To  verify  the  chi-square  approximation  to  the  distribution  of  X2  when  parameters 
are  estimated  the  distribution  ofX2  was  simulated  using  k 25,  <b  = 0.01,  N = 300, 

t,  = 10,  tj  = lOi  (i — 1,  2 25),  a = 300,  b 0.01  for  the  J-M,  S-W,  and  IBM 

models,  respectively,  for  both  the  cases  where  the  parameter  values  were  replaced 
by  their  MLEs  and  LSEs.  The  simulated  distributions  are  based  on  500  independent 
observations  of  X2  and  the  mean  and  standard  deviation  are  printed  after  the  class 
interval  tabulations.  The  hypothesized  means  and  standard  deviations  are  23  and 
6.  782,  respectively  (the  mean  of  a chi-square  with  df  degrees  of  freedom  is  df  and  the 
standard  deviation  is\/2df).  Due  to  sampling  error,  the  observed  means,  standard 
deviations,  and  quantities  of  the  distributions  will  not  coincide  exactly  with  the  true 
means,  standard  deviations,  and  quantities.  However,  close  agreement  is  observed. 
(These  results  are  shown  in  Figure  5.2. 1). 


As  mentioned  in  Section  3.  2 there  is  an  advantage  to  utilizing  the  increments 
Zi  rather  than  the  cumulative  observations  N (tj)  in  the  IBM  model  when  testing 
goodness  of  fit.  In  fact,  it  will  be  shown  here  that  if  the  cumulative  observations 
are  used  in  the  X2  statistic  (assuming  a and  b known)  the  distribution  is  not  a 
chi-square  distribution  but  rather  has  the  distribution  corresponding  to  that  of  k 
times  a random  variable  with  a chi-square  distribution  with  one  degree  of  freedom 
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MtTIlOD  OF  PARAMLTLR  LSTIKATION: 
Maximum  likelihood 


Figure  5.2.1.  Continued 


MODEL:  Jelinski-Moranda 

MLT110D  OF  PARAMETER  ESTIMATION: 
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Continued 
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Figure  5.2.1.  Continued 


Figure  S.2.1.  Continued 


Figure  5.2.1.  Continued 


. r"~~r 


« 
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(This  can  be  interpreted  as  only  utilizing  one  data  point  and  this  reflects  the 
built-in  stability  of  cumulative  observation  models).  To  show  this  consider 


k 

s = V 
i 1 


(N(t.)  - m(t. )) 


m(t.) 


where  m(tj)  a(i-e  *Jt')  (i.e.,  this  is  the  X-  statistic  utilizing  the  cumulative 
observation  rather  than  the  increments  Z,).  Considers*  defined  by 

Ji  (N(t.)  - m(t  ))2 

S*  — S > 5 : — 

k k m(t.) 

i- 1 1 


For  m(tj)  > 10  for  each  i N(tj)  has  an  approximate  normal  distribution  with  mean 
m(ti)  and  variance  m(tj)  so  that  the  vector 

\ 

V 


v 


I N(tj)  - m(t, ) 
Jk  m(tj) 


N ftj, ) - m(t^,) 
\ 7k  m(tk> 


has  an  approximate  k-variate  normal  distribution  with  mean  equal  to  the  zero 
vector  and  covariance  matrix  (sec  Section  3.2)  equal  to 


J 

k 


/ 1 


\ 1 


. . 1 


1 

/ 


i.e.,-^  times  the  kxk  matrix  of  ones 
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It  is  well  known  that  (c.f.  (1,3)  p.  188)  if  Y has  a multivariate  normal  distribution 
with  mean  zero  and  covariance  matrix  £ then  a necessary  and  sufficient  condition 
for  YtAY  (T  denotes  transpose,  A is  a kxk  matrix)  to  have  a chi-square  distribution 
is  that 


£A£A£  = £A£ 

in  which  case  the  degrees  of  freedom  is  equal  to  the  rank  of  A£.  If  we  take 

1 

« 

1 

A = I = kxk  identity  matrix,  then 


/ 1 . . . 
A ‘ 

k 

\ 1 . . . 


T T 

S*  = Y AY  = Y Y 

and  the  necessary  and  sufficient  condition  is  satisfied  since 
£A£A£  = £3  = £A£  = £2. 

The  rank  of  A £ = £ is  clearly  1.  Hence,  S*  has  a chi-square  distribution  with 
1 degree  of  freedom,  i.e.  S = kS*  has  the  distribution  described  earlier.  The 
probability  density  function  of  S*  is 


fg*  (s*) 


s'*  >•  0 


and  the  density  of  S is 


fg  (s)  = e S ^ J J 2 rr  ks  , s >0. 

The  mean  of  S is  k and  the  standard  deviation  is  J 2k^.  To  verify  these  findings, 
the  distribution  of  S was  simulated  for  k = 25  and  is  shown  in  Figure  5.2.2.  Note 
the  long  tail  and  asymptotic  behavior  near  zero.  The  hypothetical  mean  is  k = 25 
and  the  variance  is  J 2k*  = J 2(25)2  = 35.355. 

Using  the  results  described  in  this  section  che  goodness  of  fit  of  the  S/W  relia- 
bility models  can  be  tested  by  computing  the  statistic  and  comparing  its  computed 
value  to  selected  quantities  of  the  chi-square  distribution  with  appropropriate  degrees 
of  freedom.  If  a value  of  is  observed  larger  than  say  the  a quantile  (a  = 0. 90, 

0. 95,  etc.)  of  the  appropriate  chi-square  distribution  then  the  hypothesis  that  the  data 
actually  were  generated  by  the  process  in  question  is  rejected  at  the  (1  - a)  signifi- 
cance level. 
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MODEL:  IBM 

STATISTIC:  Chi-square 


‘ (N  (T  (I)  ) -M(T  ( I)  ) I 1 
* M(T( I) ) 
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1540  • 
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1300  • 

1235  » 

1170  < 

1105  • 

1040  * 

973  • 

910  • 


845 

• 

• 

780 

• 

• 

715 

• 

• 

430 

• 

• 

585 

• 

• 

520 

• 

* 

433 

• 

• 

390 

• 

t 

• 

325 

• 

* 

• 

240 

• 

• 

• 

195 

• 

t 

• 

• 

130 

t 

1 

t 

• 

• 

45 

• 

• 

• 

t 

• 
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Figure  5.2.2.  Continued 


5.  3 GOODNESS-OF-FIT  RESULTS 

It  is  desired  to  test  the  hypothesis  that  for  a given  model,  the  data  under 
consideration  were  actually  generated  by  a process  governed  by  that  model.  This 
hypothesis  will  be  denoted  by  H0.  The  alternative  hypothesis  will  simply  be  that  Ho 
is  false. 

In  Section  5.  2 it  was  shown  that  when  H0  is  true,  the  statistic  X2  discussed 
there  has  a certain  chi-square  distribution  and  can  thus  be  used  to  test  H0.  The 
hypothesis  H0  is  rejected  if  a significantly  large  value  of  X2  is  observed.  By 
"significantly  large"  it  is  meant  that  the  observed  X2  is  larger  than  a preselected 
/3-quantile  of  the  appropriate  chi-square  distribution  (0<  /3  < 1).  The  value  1-/3 
is  called  the  significance  level. 

Tables  5.  3. 1 through  5.  3. 16  give  the  X2  values  for  testing  goodness  of  fit. 

Below  each  X2  value  is  a normalized  X2  value  defined  by 

( X 2 - df)/  J 2 df 

which  for  large  df  has  a N(0,  1)  distribution  (N(0, 1)  is  a normal  distribution  with 
mean  zero  and  variance  one)  where  df  is  the  degrees  of  freedom  of  the  chi-square 
distribution  which  X2  has  when  H0  is  true  (df  = the  number  of  debugging-time 
intervals  less  the  number  of  parameters  estimated  from  the  data).  Whenever  df 
is  so  large  (greater  than  or  equal  to  100)  that  chi-square  tables  are  not  available 
the  normalized  X2  value  can  be  compared  to  the  /3-quantile  of  the  N(0, 1)  distribution, 
since  when  H0  is  true  and  df  large,  the  normalized  X2  value  is  distributed  as 
N(0,  1),  approximately. 

To  illustrate  these  concepts  consider  Table  5.3.15  referring  to  the  GPM  (LSE) 
fit  to  the  15C-fdG  data.  The  X2  value  is  28  with  degrees  of  freedom  df  = 15  (see 
bottom  of  table  for  number  of  time  intervals).  The  .975  quantile  of  the  chi-square 
distribution  with  15  degrees  of  freedom  is  27.  5 so  that  the  X2  value  of  28  is  significant 
at  the  .025  significance  level  and  H0  would  be  rejected  at  this  level.  The  .99 
quantile  of  the  chi-square  distribution  with  15  degrees  of  freedom  is  30.6  so  that  the 
X2  value  of  28  is  not  significant  at  the  .01  significance  level. 

The  significance  level  represents  the  probability  of  rejecting  H0  when  it  is  in 
fact  true  and  thus  must  be  determined  subjectively. 

On  the  basis  of  Tables  5.3.1  through  5.3.16  it  is  seen  that  HQ  would  be  -ejected 
in  almost  every  case  at  any  reasonable  significance  level.  However,  some  exceptional 
cases  are  worth  pointing  out. 

The  GPM,  J-M  and  S-W  models  appeared  to  be  the  models  which  achieved  the 
best  fits.  Table  5.3. 17  summarizes  the  best  fits  achieved  by  these  models.  The 
majority  of  these  "good"  fits  were  for  the  GPM.  This  is  not  surprising  since  the 
GPM  has  three  parameters  and  hence  extra  freedom  to  fit.  The  data  represented 
in  Table  5.3.17  contain  data  from  a command  and  control  system  (5),  a shipboard 
radar  (14),  a sonar  system  (15)  and  a ground  fixed  radar  (16).  No  other  models 
exhibited  wider  applicability  than  the  GPM,  J-M,  and  S-W  models. 
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TABLE  5.  3. 1.  CHI-SQUARE  VALUES  FOR  DATA  SET  1 


Data  Set 

Model 

1-fd 

0 

1 

L 

1 — fx 

1-fxG 

i-ff 

1-ffG 

GPM 

(LSE) 

* 

* 

* 

1356 

* 

103 

J-M 

(LSE) 

* 

* 

86622 

52441 

* 

4002 

4192 

s-vv 

(LSE) 

* 

* 

2.86xl06 

2. 95x1 06 

* 

1 32324 

236132 

GPM 

(MLE) 

* 

* 

* 

1360 

* 

* 

103 

J-M 

(MLE) 

♦ 

* 

* 

* 

* 

* 

s-vv 

(MLE) 

* 

♦ 

* 

♦ 

* 

* 

Binomial 

(LSE) 

* 

* 

* 

52390 

* * * 

*** 

I 

4188 

Binomial 

(LSE) 

* 

* 

* 

* 

*** 

* * * 

II 

IBM 

(LSE) 

20235 

12189 

95470 

53967 

*** 

*** 

748 

718 

4402 

4287 

IBM 

(MLE) 

♦ 

3082 

25873 

15351 

* * * 

170 

1188 

1223 

No.  of 

Time 

Intervals 

354 

142 

235 

80 

1 

354 

; 

142 

L 

tm 


? , 

i * 


TABLE  5.  3.  2.  CHI-SQUARE  VALUES  FOR  DATA  SET  2 


Data  Set 

Model 

2-fx 

2-fxG 

GPM 

(LSE) 

* 

1392 

33 

J-M 

(LSE) 

15500 

10698 

360 

351 

S-W 

(LSE) 

* 

* 

GPM 

(MLE) 

13216 

1392 

304 

33 

J-M 

(MLE) 

7524 

5044 

164 

158 

S-W 

(MLE) 

* 

* 

Binomial 

(LSE) 

15040 

10172 

I 

348 

333 

Binomial 

(LSE) 

15402 

10424 

II 

357 

342 

IBM 

(LSE) 

13041 

10741 

299 

352 

IBM 

(MLE) 

7779 

5314 

170 

167 

No. 

Time 

Intervals 

834 

\ 

430 

I 


TABLE  5.  3. 4.  CHI-SQUARE  VALUES  FOR  DATA  SET  4 


TABLE  5.  3.  6.  CHI-SQU ARE  VALUES  FOR  DATA  SET  8 


Data  Set 

Model 

6-fd 

6-f 

GPM 

(LSI-;) 

* 

* 

J-M 

(LSE) 

5484 

112 

329 

153 

S-W 

(LSE) 

104828 

375 

6444 

519 

GPM 

(MLE) 

* 

* 

J-M 

(MLE) 

* 

* 

S-W 

(MLE) 

12863 

172 

784 

239 

Binomial 

(LSE) 

5455 

110 

I 

328 

150 

Binomial 

(LSE) 

54  70 

112 

II 

329 

152 

IBM 

(LSE) 

5340 

945 

319 

125 

IBM 

(MLE) 

2110 

C76 

122 

90 

No. 

Time 

Intervals 

134 

28 

' 


TABLE  5.  3.  7.  CHI-SQUARE  VALUES  FOR  DATA  SET  7 


~1 


Data  Set 
Model 

7 -Id 

7-fdG 

7-fx 

7-fxG 

7-ff 

7-ffG 

GPM 

(LSE) 

1124 

376 

1 1 260 

3071 

393 

36 

10 

719 

260 

11 

J-M 

(LSE) 

1404 

1065 

38806 

6075 

* 

48 

45 

2487 

515 

S-W 

(LSE) 

3737 

8662 

536298 

76599 

. 

146 

438 

34467 

6562 

GPM 

(MLE) 

1124 

376 

9952 

3055 

★ 

393 

36 

10 

635 

258 

11 

J-M 

(MLE) 

1301 

806 

22821 

4157 

43 

32 

1459 

351 

s-w 

(MLE) 

1823 

3081 

136024 

19663 

G5 

150 

8736 

1680 

Binomial 

(LSE) 

1 399 

1042 

38460 

6030 

* * * 

♦ ** 

I 

47 

44 

2464 

511 

Binomial 

(LSE) 

1402 

1053 

38634 

6054 

* + + 

**  * 

11 

47 

45 

2476 

513 

IBM 

(LSE) 

1479 

1115 

33210 

6234 

*** 

**  * 

51 

48 

2118 

525 

TABLE  5.3.10.  CHI-SQUARE 
VALUES  FOB  DATA  SET  10 


TABLE  5.3.11.  CHI-SQUAHE 
VALUES  FOR  DATA  SET  11 


Data  Set 
Model 

10-fd 

Data  Set 

Model 

11 -fd 

G PM 

(LSE)  * 

GPM 

(LSE) 

* 

J-M 

(LSE)  1 503 

J-M 

(LSE) 

520 

96 

99 

S-VV 

(LSE)  i 503 

S-W 

(LSE) 

520 

96 

1 

99 

GPM 

(MLE)  * 

G PM 

(MLE) 

* 

J-M 

(MLE)  180 

J-M 

(MLE) 

491 

92 

| 

94 

S-W 

(MLE)  480 

S-W 

(MLE) 

491 

92 

94 

Binomial 

(LSE)  503 

Binomial 

(LSE) 

520 

I 

96 

99 

Binomial 

(LSE)  503 

Binomial 

(LSE) 

520 

11 

96 

II 

99 

IBM 

(LSE)  564 

IBM 

(LSE) 

585 

104 

108 

IBM 

(MLE)  519 

IBM 

(MLE) 

530 

99 

101 

No. 

1 

No. 

Time 

• 

Time 

Intervals 

15 

Inte  rvals 

15 

TABLE  5.3.12.  CHI-SQUARE 
VALUES  FOR  DATA  SET  12 


Data  Set 

Model 

12-fd 

12-fdG 

GUM 

(LSE) 

♦ 

* 

J-M 

(LSE) 

♦ 

11 

7 

S-VV 

(LSE) 

* 

* 

G PM 

(MLE) 

* 

* 

J-M 

(MLE) 

72 

8 

7 

5 

S-W 

(MLE) 

* 

* 

Binomial 

(LSE) 

* 

5 

I 

3 

Binomial 

(LSE) 

7 

II 

4 

IBM 

(LSE) 

* 

* 

IBM 

(MLE) 

72 

10 

7 

6 

No. 

Time 

Intervals 

26 



3 
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TABLE  5.  3. 13.  CHI-SQUARE  VALUES  FOR  DATA  SET  13 


Data  Set 

Model 

13-fd 

1 3-fdG 

13-fx 

13-fxG 

_ 

13-ff 

C.  PM 

(LSE) 

1992 

* 

487 

1975 

61 

44 

61 

J-M 

(LSE) 

* 

l 

* 

* 

* 

* 

S-VV 

(LSE) 

* 

* 

* 

* 

* 

GPM 

(MLE) 

1994 

191 

* 

487 

1975 

61 

6 

44 

61 

J-M 

(MLE) 

2209 

5981 

3003 

* 

143 

418 

295 

S-W 

IMLE) 

* 

* 

* 

* 

♦ 

Binomial 

(LSE) 

3175 

* 

* 

*** 

I 

iso 

Binomial 

(LSE) 

* 

* 

* 

* 

* * * 

II 

IBM 

(LSE) 

14336 

5133 

13022 

6742 

522 

340 

91  1 

663 

IBM 

(MLE) 

4140 

2331 

5551 

2681 

141 

151 

387 

263 

No. 

Time 

Intervals 

360 

110 

101 

52 

360 

TABLE  5.  3.  14.  CHI-SQUARE  VALUES  FOR  DATA  SET  14 


Data  Set 

Model 

14  -fd 

14-fdG 

14-fx 

14-fxG 

14-ff 

14-: 

GPM 

(LSE) 

716 

483 

118 

711 

* 

41 

82 

27 

41 

J-M 

(LSE) 

1691) 

891 

5595 

5S5 

1843 

951 

108 

104 

957 

136 

117 

111 

S-\V 

(LSE) 

* 

* 

* 

3641 

* 

* 

856 

GUM 

(MLE) 

701 

* 

426 

119 

709 

* 

40 

72 

28 

41 

J-M 

(MLE) 

945 

476 

4439 

451 

267 

478 

57 

54 

758 

104 

54 

S-\V 

(MLE) 

* 

* 

51492 

1818 

* 

52 

8828 

426 

2 

Binomial 

(LSE) 

1543 

770 

5011 

507 

* * * 

1 

| 

1 97 

89 

857 

117 

Binomial 

(LSE)  ! 

1610 

824 

5226 

543 

* * * 

* * * 

II 

102 

96 

893 

126 

IBM 

(LSE) 

! 1949 

947 

6547 

625 

' 

* * * 

* * * 

121 

109 

1088 

138 

IBM 

(MLE) 

1020 

493 

4346 

480 

62 

56 

| 7-12 

111 

No. 

Time 

Intervals 

J 111 

36 

19 

1 1 

111 

36 

, 

TABLE  5.  3.  16.  CHI-SQUARE  VALUES  FOR  DATA  SET  16 


Data  Set 

Model 

16-fd 

16-fdG 

16-fx 

16-fxG 

1 G-ff 

16-ffG 

GPM 

(LSE) 

3111 

1920 

869 

56 

* 

* 

279 

268 

14 

-4 

J-M 

(LSE) 

24123 

4334 

1789 

* 

* 

3342 

128 

113 

S-VV 

(LSE) 

* 

* 

90676 

* 

891591 

* 

2987 

99678 

GPM 

(MLE) 

3147 

1929 

869 

56 

* 

* 

352 

269 

| 

14 

-4 

J-M 

(MLE) 

10097 

10061 

1898 

857 

* 

1124 

1392 

48 

50 

S-W 

(MLE) 

* 

* 

* 

* 

31 

0.  03 

-i 

-4 

Binomial 

(LSE) 

♦ 

24048 

4265 

1638 

* * * 

I 

! 3331 

12 

103 

Binomial 

(LSE) 

* 

* 

4299 

1533 

*** 

127 

107 

IBM 

(LSE) 

19423 

i 23809 

4253 

1854  ' 

**  * 

*** 

2140 

i 3236 

126 

116 

IBM 

(MLE) 

9841 

9815 

1878 

835 

*** 

*** 

1096 

! 1357 

47 

49 

No. 

Time 

Intervals 

! 42 

! 28 

458 

113 

42 

28 
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TABLE  5.3.17.  BEST  FITS  BY  GPM,  J-M,  AND  S-W  MODELS 


Model 

— 

Data  Set 

X2 

Degrees  of 
Freedom 

Highest  Level  of  Significance 

for  Rejection  of  H * 

J o 

GPM  (LSE) 

5-fdG 

47 

31 

0.030 

GPM  (LSE) 

5-fxG 

26 

33 

0.  80 

GPM  (MLE) 

5-fdG 

47 

31 

0.030 

GPM  (MLE) 

5-fxG 

26 

33 

0.80 

S-W  (MLE) 

14-ffG 

1 

52 

32 

0.014 

GPM  (LSE) 

1 5-fdG 

58 

28 

0. 00073 

GPM  (LSE) 

15C-fdG 

28 

15 

0.021 

GPM  (MLE) 

15-tdG 

57 

28 

0. 0010 

J-M  (MLE) 

15B-fdG 

2 

1 

0.  16 

GPM  (MLE) 

15C-fdG 

27 

15 

0.  029 

GPM  (LSE) 

16-fxG 

56  i 

| 

110 

>0. 99099 

GPM  (MLE) 

16-fxG 

56 

110 

> 0.99999 

S-W  (MLE) 

16— ff 

31 

40 

0.  85 

S-W  (MLE) 

16-ffG 

.03 

26 

** 

GPM  (LSE) 

5-ffG 

47 

31 

0.  030 

GPM  (MLE) 

5-ffG 

47 

31 

0.  030 

* I.e.,  H0  would  not  be  rejected  at  any  significance  level  below  the  values 
tabled  here. 

**  Accurate  value  not  available  but  practically  equal  to  1. 
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Indeed,  the  only  other  cases  for  which  H0  reasonably  would  not  be  rejected  occurred 
for  the  IBM  model  in  Table  5.3.15  and  for  the  Binomial  I in  Table  5.3.12,  and  in  all 
these  cases  the  number  of  time  intervals  was  only  3 leaving  only  one  degree  of 
freedom  for  the  X2  statistic.  Therefore,  although  no  model  is  seen  to  fit  consis- 
tently, the  GPM  is  the  most  widely  applicable. 

Whether  or  not  H0  is  true  the  X2  statistic  is  a measure  of  how  closely  the 
observed  data  is  approximated  by  the  expectations  corresponding  to  the  given  model. 
In  this  sense,  models  and  data  sets  can  be  compared.  For  example,  the  GPM, 

J-M,  or  S-W  models  quite  often  showed  lower  values  of  X2  over  a fixed  data  set  than 
the  other  models.  In  fact  only  in  Tables  5.3.4  (command  and  control  data),  5.3.6 
(generalized  information  processing  system),  5.3.9  (Ref  (11,2)),  and  5.3.12  (Ref 
(II,2))were  the  GPM,  J-M,  or  S-W  outperformed.  In  all  other  cases  (i . e . , cases 
where  fits  were  achieved  with  either  the  GPM,  J-M,  or  S-W  and  the  IBM  and/or 
Binomial  models)  one  of  the  models  (GPM,  J-M,  or  S-W)  had  the  lowest  X2  value 
over  all  other  models  for  a fixed  data  set.  Except  for  Table  5.3. 12,  the  IBM 
model  had  the  lowest  X2  for  cases  where  the  GPM,  J-M,  and  S-W  were  out- 
performed. In  Table  5.  3. 12  (data  set  12-fdG)  the  Binomial  1 was  the  best. 

Finally,  it  should  be  pointed  out  that  whenever  both  the  MLE  and  LSE  tech- 
niques yielded  estimates  for  a given  model,  the  X2  value  associated  with  the  MLE 
case  is  always  smaller  than  the  X2  value  associated  with  the  LSE  case.  This  is  noc 
surprising,  however,  in  view  of  the  results  discussed  in  Section  5. 1. 2.  Further- 
more, as  discussed  in  Section  5.2,  grouping  the  data  such  that  there  were  10  or 
more  observations  per  debugging-time  interval  nearly  always  improved  the  X2  v^lue 
(note  that  all  but  one  of  the  best  fits  in  Table  5.  3. 17  were  for  these  regrouped  data 
sets).  Indeed,  many  of  the  extremely  large  normalized  X2  values  may  be  explained 
by  observing  that  for  all  models  (except  possibly  the  GPM)  the  expected  number  of 
errors  in  a given  debugging-time  interval  approaches  zero  as  the  time  interval 
length  approaches  zero.  Hence  recording  (perhaps  erroneously)  some  errors  in  a 
very  short  time  interval  will  have  profound  effects  on  the  X2  statistic. 

A possible,  perhaps  probable,  reason  for  the  poor  performance  of  these  S W 
reliability  models  as  demonstrated  in  Tables  5.  3. 1 through  5.  3. 16  is  the  violation 
of  the  model  internal  assumptions.  For  a further  discussion  of  these  considerations, 
see  Section  6.  0. 


5.4  OPTIMUM  TIME  INTERVAL  LENGTHS 

In  the  models  considered  in  this  study  the  debugging-time  interval  lengths  are 
not,  strictly  speaking,  parameters  but  are  part  of  the  observations.  Therefore, 
there  is  no  freedom  to  "choose"  them  after  the  data  have  been  collected.  Guidelines 
may  be  set  in  choosing  the  debugging-time  interval  lengths  before  data  is  collected 
but  this  was  not  done  for  any  of  the  data  in  this  report. 

Optimality  of  debugging-time  interval  lengths  must  be  defined  in  terms  of 
goodness  of  fit  for  a particular  model.  In  this  study,  nop-overlapping  debugging-time 
intervals  were  combined  so  that  the  number  of  observations  in  each  total  time 


interval  was  greater  than  or  equal  to  10  to  ensure  the  validity  of  the  distribution 
of  the  chi-square  statistic  used  for  testing  goodness  of  fit  (see  Section  5.2).  The 
results  presented  in  Section  5.  3 indicate  that  in  most  cases,  this  procedure  improved 
the  fit  of  the  model.  It  should  not,  however,  be  concluded  that  further  lengthening 
the  debugging-time  intervals  will  further  improve  the  fit.  Indeed,  one  could  combine 
all  the  intervals  into  one  large  interval  and  obtain  no  fit  since  there  would  be,  then, 
only  one  observation  and  two  or  more  parameters  to  estimate.  This  does  suggest, 
though,  that  an  optimal  time  interval  length  exists  but  the  problem  of  finding  this 
optimal  length  is  not  well  posed  because  a)  this  length  will  undoubtedly  depend  on 
the  unknown  parameters  N,  <i>  , a , a,  b,  etc.  and  b)  estimates  for  the  unknown 
parameters  cannot  be  substituted  because  they  will  depend  on  the  debugging-time 
interval  lengths.  Simulation  techniques  like  those  employed  in  Section  4.4  may  be 
useful  for  fixed  values  of  the  unknown  parameters,  but  would  be  extremely 
expensive  and  were  not  investigated  in  this  study. 


5.5  COMPARISON  OF  MODELS 


5.5.1  Basic  Model  Similarities 

Many  basic  similarities  were  seen  among  the  models  studied.  Both  the  LS 
and  ML  methods  of  obtaining  parameter  estimates  were  appropriate  for  use  with 
all  the  models.  For  the  binomial  models  of  Section  3.3,  however,  the  ML  method 
proved  to  be  much  too  costly  for  practical  use.  Applying  the  Newton -Raph son 
method  for  each  integer  value  of  N in  the  input  range,  together  with  the  evident 
need  for  a very  large  range,  caused  the  computer  runs  to  be  both  time  consuming 
and  expensive.  This  tends  to  imply  the  probable  impracticality  in  testing  only 
integer  values  of  N in  other  models,  as  well. 

Five  of  the  models  being  considered  give  estimates  for  the  parameter  N,  the 
total  number  of  errors  originally  present.  These  models  do  not  allow  for  the 
possible  introduction  of  new  errors  (as  corrections  are  made)  during  the  software 
debugging  process.  The  IBM  model  of  Section  3.2,  however,  gives  an  estimate 
for  a = lim  E (N (t )),  the  total  number  of  errors  expected  over  to  occur.  This  is 

t — 

the  only  model  which  accommodates  the  possibility  of  introducing  new  errors  while 
correcting  others.  However,  because  of  the  presence  of  this  added  factor,  the 
unknown  number  of  initial  errors  is  not  explicitly  present  in  this  model. 

Due  to  this  basic  difference  in  the  parameters  N and  a,  infei'ences  about  the 
introduction  of  new  errors  during  the  debugging  process  can  be  made,  if  the 
estimates  obtained  for  a and  N are  generally  close,  the  assumption  that  no  (or  few) 
new  errors  are  introduced  would  be  strengthened.  If,  however,  estimates  for  a 
are  much  larger  than  estimates  for  N,  the  presence  of  errors  introduced  in  the 
debugging  stage  should  be  assumed.  In  actual  testing,  both  cases  were  observed 
(see  Table  5.5.1),  although  close  estimates  were  more  prevalent.  This  may 
indicate  that  the  number  of  errors  introduced  through  debugging  is  small  relative 
to  the  number  of  errors  present  initially  in  the  software. 
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TABLE  5.  5.  1.  SELECTED  MODEL  COMPARISONS  FOR  THE  ESTIMATION  OF  N 


i 


U 

I 


Data  sets  showing  closeness  and  wide  difference  between  estimates  obtained 
for  the  model  parameters  N and  a. 


Estimates 

For 

N 


Estimates 

For 

a 


Model 

Data  Set  7-fd 

Data  Set  16-fd 

GPM  (LSE) 

4720 

2572 

J-M  (LSE) 

4451 

♦ 

S-W  (LSE) 

3787 

* 

GPM  (MLE) 

4618 

1703 

J-M  (MLE) 

4458 

3698 

S-W  (MLE) 

4186 

★ 

BIN  I (LSE) 

4450 

* 

BIN  n (LSE) 

4451 

* 

IBM 

4230 

1.22  x 108 

IBM 

4470 

L 

13826 

The  GPM  described  in  Section  3. 1 requires  estimates  of  three  different  param- 
eters, while  all  other  models  contain  only  two.  Estimating  an  increased  number  of 
parameters  provides  a larger  freedom  for  the  separation  of  different  influences  on  the 
data  sets.  In  this  three-parameter  case,  three  different  factors  influencing  the 
occurrence  of  errors  can  be  distinguished.  However,  numerical  (tractability)  compli- 
cations are  also  added,  and  convergence  of  the  estimates  becomes  even  more  difficult 
to  obtain. 

All  the  models  assume  decreasing  error  rates.  Although  this  assumption  is 
made,  very  few  of  the  actual  data  sets  have  uniformly  decreasing  error  rates.  This 
may  have  caused  convergence  problems  in  the  application  of  the  Newton -Raphson 
method. 


I 


5.5.2  Similarities  Obtained  for  Different  Models  and  Methods 

Many  inherent  similarities  among  the  models  have  been  found.  Most 
generally,  the  ML  and  LS  methods  for  obtaining  estimates  usually  resin,  d in  simi- 
lar estimates  for  the  same  model.  This  trend  supports  the  validity  of  both  methods. 
Most  noticeably,  extremely  close  estimates  of  <t>  were  obtained  through  both  methods 
for  each  of  the  Poisson  type  models  of  Section  3. 1.  The  costliness  of  the  ML 
methods  for  the  binomial  models  of  Section  3.3  is  the  only  reason  a comparison 
cannot  be  made  in  those  cases. 

For  all  the  models,  the  grouped  data  sets  had  slightly  better  convergence 
properties  (convergence  was  obtained  more  often  and  more  easily)  than  the  corre- 
sponding non-grouped  data  sets.  In  the  majority  of  cases,  grouping  altered  the 
parameter  estimates  obtained  very  little,  although  generally  better  chi-square 
values  resulted  for  these  fits.  A direct  comparison  of  these  chi-square  values  for 
one  data  set  is  given  in  Table  5.5.  2. 

Extremely  close  values  of  the  estimates  of  a (in  the  LS  version  of  the  binomial 
model  of  Section  3.3)  with  parameter  Pi  = 1 - e-a  Ti  and  <i>  (in  the  LS  version  of 
the  Jelinski-Moranda  model  described  in  Section  3.  1)  were  observed  when  both 
had  been  found.  This  can  be  explained  by  observing  the  least  squares  sums  to  be 
minimized. 

In  the  J-M  model  it  is 


2 {Ni 


while  in  the  binomial  model  it  is 


s2  ■ i i si  • 


(N  - M.  j)  (1  - e 


Notice  that  the  only  difference  occurs  between  the  values  <t>T  j and  1 - e a*. 
Series  expansion  shows  that 

1 "aTi  i ,i  (aTi)3 

1 - e 1 = 1 - (1  - ax.  + — - — + . . . ) 


(aT.l^  (a-r.)'^ 

The  portion  of  the  series,  - + — ^r~  + ~ ■ • • » >s  approximately  zero 

for  small  values  of  aT^. 

This  gives 


S { N.  - (N  - M.  ) ar.  > 

2 l—  ) i i-l  i ‘ 

i-1 

which  is  exact! v Sj  if  4>  is  substituted  for  a (similar  analysis  will  lead  to  the  same 
result  for  the  Binomial  model  with  Pj  = Ti/(Ti  + c)  for  4>  = 1/c  and  c large). 

Therefore,  since  the  values  found  for  both  4>  and  a are  usually  small,  their 
closeness  is  to  be  expected. 

5.  6 RECOMMENDED  MODELS 

Basically,  the  results  of  the  goodness-of-fit  tests  and  other  aspects  of  the 
data  analysis  indicate  none  of  the  models  fit  very  well.  However,  as  discussed  in 
Section  6.0,  the  various  data  sets  showed  obvious  tendencies  to  increasing  (in  time) 
error  rates:  more  or  less  contrary  to  the  model  assumption  that  mean  error  rates 
should  be  non-increasing.  Also  it  seems  clear  that  the  observed  increasing  error 
rate  situation  cannot  simply  be  written  off  as  due  the  introduction  of  errors  during 
the  debugging  process  or  as  due  purely  to  chance:  the  fluctuations  are  too  large. 
Thus  the  data  sets  themselves  have  some  problems  of  a nature  unknown  to  us; 
probably  lack  of  constancv  of  the  debugging  effort  both  in  terms  of  man  power  and 
the  S7W  modules  available  for  test.  Thus  with  better  data  more  fits  might  have 
been  obtained.  The  "proof"  of  this  allegation  will  be  given  shortly. 


It  is  not  surprising  that  the  Poisson  models  were  the  best  fitting  for  they 
incorporate  not  onlv  finds  (NO  but  fixes  (Mj-i).  Thus  finds  do  not  have  to  be  fixed 
immediately.  It  is  also  not  surprising  that  the  GPM  model  was  the  best  fitting  of 
Poisson  models;  it  has  3-parameters  and  hence  a great  deal  of  flexibility.  A 
possibly  disturbing  feature  of  the  GPM  results  is  that  the  estimates  of  shaping 
parameter  o virtually  always  came  out  negative  but  near,  very  near,  zero.  When 
the  estimates  of  a came  out  positive  they  were  also  near  zero.  This  indicates 
that  o = 0 and  hence  the  Poisson  mean  is  of  the  approximate  form  <t>  (N  - Mf _i ) 
since  t^O  = 1.  This  (the  data)  says  that  the  error  rate  does  not  depend  on 
much  but  it  depends  on  other  factors  extant  when  the  data  were  generated,  like 
non-constant  debugging  effort.  Thus  there  is  every  indication  that  the  failure  to 
obtain  a large  number  of  fits  was  not  the  fault  oi  the  models  above. 

In  view  of  the  above  we  believe  we  can  recommend  the  GPM.and  to  a lesser 
extent  the  J-M  and  S-W  models.  The  IBM  model  is  unsatisfactory  To f, --muon 
other  reasons,  the  reason  that  there  is  no  parameter  N in  the  model.  ~~  


5-61 


Section  6.  0 


1 


VALIDATION  OF  INTERNAL  ASSUMPTIONS 


The  crux  of  the  matter,  the  thing  that  essentially  determines  how  well  a model 
fits  the  observed  data,  is  are  the  assumptions  explicit  or  implicit  in  the  model  valid 
for  the  stochastic  process  which  actually  generated  the  data?  Another,  related  point, 
is  how  sensitive  ai'c  the  models  to  a failure  (in  the  data)  of  the  model  assumptions  to 
hold? 

The  S/YV  reliability  models  studied  in  this  report  have  no  lack  of  explicit  and 
implicit  assumptions. 

Perhaps  the  foremost  assumption  in  all  of  the  models  is  that  the  expected  (mean) 

value  of  the  number  of  errors  occurring  in  interval  i,  i = 1,  2 k (say)  is  a never 

increasing  function  of  increasing  i (provided  of  course  account  is  taken  of  the  possibly 
different  magnitudes  of  the  For  example  in  the  S-YV  model  (the  GPM  with  »=  2) 

E (Nj)  = 4>(N  - Mi-i)  Tj",  and  since  Mj_]  is  never  decreasing  in  i and  assuming 
T1  - T2  = • • • Tk*  clearly  E (Nj)  is  never  increasing  in  i.  The  IBM  model  has  a 
similar,  perhaps  more  implicit  assumption.  By  definition  for  the  IBM  model 

E [number  of  errors  in  (0,  t)J  = a (1  - e *Jt). 

Now  consider  an  interval  (t,  t + h),  t,  h 0;  i.  e. , an  interval  of  width  h. 


E [number  of  errors  in  (t,  t •+  h)] 

= E [number  of  errors  in  (0,  t + h)]  - E [number  of  errors  in  (0,  t)] 
- a(l-e-b(t4h>)  - a,l-e-bt)  » ae‘bt  (l-e-bh) 


For  a,  b > 0 as  required  by  the  IBM  model  the  function  on  the  right  in  the  last 
equation,  for  fixed  h,  is  decreasing  in  t.  A similar  result  is  true  for  the  Binomial 
models. 


This  assumption  is  extremely  plausible  because  as  errors  are  observed  and 
corrected  less  errors  remain.  However,  as  plausible  as  this  assumption  is  it  has 
associated  with  it  two  serious  questions.  First,  what  happens  if  the  assumption  is 
false  (perhaps  due  to  increased  number  of  debugging  personnel  in  the  later  stages  of 
S/YV  development)  ? Second,  what  happens  if  the  assumption  is  valid  but  due  to  pure 


6-1 


1 


random  fluctuations  the  observed  number  of  errors  (as  against  the  expected  number) 
is  increasing  (at  least  part  of  the  time)  in  i ? Regarding  the  first  question  it  is  simply 
a case  of  the  model  not  being  descriptive  of  the  physical  (stochastic)  process  gener- 
ating the  data.  We  feel  that  the  failure  of  this  assumption  to  be  valid  is  a major  (but 
not  the  only)  cause  of  the  failure  of  the  models  to  fit  the  data  with  any  reasonable 
frequency.  The  second  question  has  actually  more  serious  implications.  This 
question  is  pointed  toward  the  sensitivity  of  the  models  to  the  data.  After  all  even  if 
the  expected  number  of  errors  is  decreasing  per  unit  time,  purely  random  (e.g. , 
according  to  the  Poisson  distribution  for  the  GPM)  fluctuations  will  cause  some 
intervals  to  have  more  errors  observed  than  previous  intervals.  If  the  models  are 
over  sensitive  to  this  condition  they  are,  in  a very-  real  sense,  useless.  Now  quite 
apart  from  questions  of  goodness-of-fit  of  the  models,  we  found  (as  discussed  in 
Section  5. 1)  many  cases  of  failure  of  the  (numerical)  parameter  estimation  procedure 
to  converge  or  the  (numerical)  parameter  estimation  procedure  led  to  absurd  results 
(e.g.,  an  estimate  of  N,  the  initial  number  of  errors,  smaller  than  the  observed 
number  of  errors  already  removed).  Even  though  the  goodness-of-fit  results  (see 
Section  5.3)  were  mostly  quite  bad  as  measured  by  the  observed  standardized  chi- 
square  values  it  seems  that  the  models  are  very  sensitive  to  the  failure  of  the  data  to 
show  an  observed  decreasing  (in  time)  error  rate.  This  is  indeed  a severe  short- 
coming. To  further  emphasize  this  problem  suppose  that  a chi-square  goodness-of- 
fit  test  is  run  on  some  data  on  the  hypothesis  that  the  data  follows  a normal  distri- 
bution. If  the  data  really  are  from,  say  a gamma  distribution,  (i.e. , the  hypothesized 
model  (normal)  is  incorrect)  the  goodness-of-fit  procedure  at  least  does  not  break 
down!  That  is  it  won't  fail  to  give  some  kind  of  answer.  Apparently,  when  the 
observed  data  depart  from  the  deterministic  (decreasing  mean  error  rate)  assump- 
tion s,  for  whatever  reason,  the  models  often  blow-up  hi  the  sense  that  no  parameter 
estimates  are  even  obtainable.  To  check  the  behavior  of  the  observed  error  rate  we 
fit  parabolas  and  straight  lines  to  the  observed  error  rate  versus  time  for  a number 
of  data  sets.  Some  of  the  data  sets  were  omitted:  the  -G  (grouped)  data  sets  were 
omitted  because  they  were  not  the  original  data  and  the  ff  data  sets  were  not  used 
because  they  have  no  meaning  hi  this  calculation. 

The  straight  lines  and  parabolas,  whose  equations  are  given  hi  Table  <i.  1,  were 
not  fitted  for  any  predictive  purposes:  they  were  fitted  solely  to  try  to  obtahi  some 
single  or  overall  measure  of  the  behavior  of  the  error  rate  versus  time.  The  results 
are  depicted  in  Figures  6.1.1  - 6. 1.27.  About  one-quarter  (7/27)  of  the  straight  lines 
showed  positive  slope  indicating,  admittedly  very  roughly,  an  increashig  (in  time) 
error  rate.  Many  of  the  parabolas  reach  the  maximum  (and  hence  start  down) 
relatively  far  into  the  time  frame.  Moreover  some  of  the  parabolas  had  minimums 
instead  of  maximums  and  the  error  rate  stalled  back  up  late  hi  time.  On  inspecting 
Tables  5.  1. 1 - 5. 1. 16  it  can  be  seen  that  quite  severe  (except  for  the  IBM  model) 
convergence  problems  were  experienced  for  data  sets  1-fd,  3-fd,  3-fx  and  13-fx. 
Inspecting  the  corresponding  Figures  1,4,5  and  19  the  straight  lines  had  positive 
slope  in  3 of  the  4 cases  (the  fourth  case  had  a slope  not  all  that  different  from  zero). 
All  four  cases  had  parabolas  reaching  their  maximum  late  in  time;  hence  the  error 
rate  started  down  very  far  into  the  program.  This  would  seem  to  give  a gciv  ral 
indication  that  when  the  data  'misbehaved,  *’  albeit  due  to  purely  random  fluctuations, 
the  models  "blew-up."  Of  course  whatever  the  source  of  the  observed  increasing 
error  rate  the  models  (except  for  the  IBM  model)  were  very  sensitive  to  it.  T e IBM 
model  had  its  own  problems;  for  the  same  data  sets  (1-fd,  3-fd,  3- be,  13-£x)  t.ie  IBM 
model  gave  absurdly  high  estimates  of  a (for  3-fd  and  the  MLE  a was  almost  1013  in 
the  face  of  366  observed  errors).  For  3-fx,  convergence  for  a w\us  not  even  obtained. 
For  data  set  13-fx  a = (937)  was  below  the  observed  errors  (1789). 
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TABLE 
(x  = 


6.1.  ERROR  RATE  VERSUS  CUMULATIVE 
cumulative  elapsed  time,  dependent  variable 


TIME  EQUATIONS 
= errors/ day) 


Data  Set  No. 


Parabola 


Straight  Line 


-4.7E-5X  + 4. 1E-2X-2.5 
-6.3E-5X2  + 5.9E-2.X  - 4.  2 


3.  8E-5x2  - 7.  0E-2x  + 21.  6 


2.  lE-4x“  - 7.  6E-2x  + 14.  7 


1.  9E-5x  - 3.  7E-2x  + 20.5 


9.  lE-3x  +1.8 
1.6E-2X  + 0.91 
-5.9E-2X  +21,0 
1.  9E-2x  +7.4 
-1.7E-2X+  16.9 


3-fd 

9 

-2.8E-4X  + 7.0E-2X  - 0.27 

1.  2E-2.X  + 2.3 

3-fx 

-2. 9E-4x2  + 7. 1 E-2x  + 2.  5E-2 

1.  4E-2x  + 2.3 

5-fd 

-6.5E-4X2  + 1.4E-1X  -1.5 

-6.6E-3X+  4.  6 

5-fx 

-1.3E-4X2  + 3.7E-2X+  1.2 

_ 

-8. 1E-3X  + 4.  3 

15C-fd 

2.  2E-7X2  - 8.  5E— lx  + 0.  90 

-3.5E-4.X  + 0.  70 

l5A-fd 

-2.7E-7.X2  + 6.5E-4x  - 1.  IE-2 

— 1. 3E-5x  + 0.  35 

15B-fd 

6.  4E-9x2  - 3.  5E-5.X  + 7.  IE-2 

-2.0E-5X  + 6.  6E 

6-fd 

-1.8E-5X2  + 1.2E-2X  + 0.  63 

4.  5E-3x  + 1. 1 

8-fd 

1.2x2  - 28.  6x  + 202.9 

-8, 9x  + 147.1 

9-fd 

1.3x2  - 32. lx  + 244.4 

-10.  6x  + 183.5 

10-fd 

2.  Lx2  - 51.6.x  + 395.5 

-13.  4x  + 287.  5 

11 -fd 

2.3.x2  - 48.  6x  + 356.6 

-11.  9x  + 25.2 

12-fd 

-7.3E6.X2  + 3.  0E-3.X  + 0.  16 

- 

10,  0E-4.X  + 0.25 

15-fd 

-1.  4E-7x2  + 3. 4E-lx  + 0. 18 

2.  8E-5.X  + 0.  30 

14-fd 

-6.2E-6.X2  - 1. 9E-2x  + 5.  3 

-2.0x+  5.4 

14 -£x 

-6. 7 E— Lx2  + 7.0E-2.X  +11.0 

-6.4E-2.X  + 15.6 

13-fd 

3.  2E-6x2  - 8.  4E-3x  + 5.  6 

-5.1E-3X+  5.2 

13-fx 

-4.3E-6X2  - 1.3E-3X+  6.3 

-5.6E-3X+  6.8 

4-fd 

-2.5E-3X2  + 3.5E-1X+  19.6 

-7.7E-2X+  31.9 

4-fx 

-5. 9E-4x2  + 7. 8E-2x  + 25. 8 

-6.4E-2X  + 31.8 

16-fd 

-3.  6E-5x2  + 2.4E-2X  + 3.  4 

-5.7E-3X+  7.2 

16-fx 

-1. 2E-5.X2  + 6.  3E-3x  + 2.  1 

-2.1E-3X+  3.2 

NOTE:  EY  = 10Y 
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Figure  6.1.13.  Error  Rate  Versus  Cumulative  Time  - Data  Set  8-fd 


frrr 

' *" 

. . , 

1 

. ! 

hr 

' ; 

: ■ 

• * 

• 

• 

• 

1 

1 

7 i l 

0.00  2.00 

U.00 

6.00 

8.00 

I I T 

10.00  12.00  14.00 

1 JUL  17,191V 

— 

X 

cumulative  days  elapsed 

Figure  6.1.15.  Error  Rate  Versus  Cumulative  Time  — Data  Set  10-fd 
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Error  Rate  Versus  Cumulative  Time  — Data  Set  11  -fd 
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Figure  6.1.17.  Error  Rate  Versus  Cumulative  Time  - Data  Set  1 2-fd 
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Figure  6.1.20.  Error  Rate  Versus  Cumulative  Time  - Data  Set  14-fd 
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Figure  6.1.21.  F:.rror  Rate  Versus  Cumulative  Time  — Data  Set  14-fx 


Figure  6.1.22.  Error  Rate  Versus  Cumulative  Time  — Data  Set  15-fd 
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Regarding  the  assumption  of  decreasing  mean  error  rate  inherent  in  all  of  the 
models,  we  have  two  conclusions.  First  the  data  sets  showed  a marked  departure 
from  this  assumption  and  the  departure  appears  to  be  larger  than  just  chance 
fluctuations.  Second  the  various  models  are  sensitive  to  this  condition. 

Another  assumption  inherent  in  all  of  the  models  is  related  to  the  immediately 
preceding  discussion.  It  is  implicit  in  each  model  studied  (except  in  the  IBM  model, 
in  a limited  sense)  that  no  errors  are  introduced  in  the  course  of  discovering  and 
removing  errors.  The  possibility  of  introducing  errors  during  the  debugging  process 
could  also  cause  the  observed  increasing  error  rate  condition  previously  mentioned. 
Certainly  errors  are  introduced  during  the  debugging  process  so  that  this  (implicit) 
assumption  of  each  model  is  invalid.  The  magnitude  of  its  effect  is  unknown  to  us 
because  we  cannot  separate  the  three  possible  sources  of  increasing  error  rate  in  the 
data 

i)  chance  fluctuations 

ii)  increased  (as  time  goes  on)  manpower/ debugging  effort 

iii)  introduction  of  errors  during  the  debugging  process. 

For  the  GPM,  S-\V  and  J-M  models  there  is  another  invalid  assumption,  which, 
we  must  admit,  appears  to  be  in  the  "noise. " To  assume  that  the  number  of  errors 
occurring  in  some  finite  interval  t j • o has  a Poisson  distribution  is  to  assume, 
implicitly,  the  possibility  of  an  infinite  number  of  errors  in  tj.  Clearly  the  number 
of  errors  must  be  finite.  However  if  the  intervals  are  reasonably  large  with  a 
reasonable  number  of  them  (intervals)  there  should  be,  and  we  saw,  no  problem. 

Another  invalid  internal  assumption  of  the  Poisson  type  models  is,  because  the 
Poisson  mean  is  assumed  constant  throughout  tj,  the  implicit  assumption  that  the 
errors  found  in  Tj  are  removed  at  the  end  of  the  interval.  Much  the  same  assump- 
tion is  true  of  the  binomial  model  which  assumes  a constant  probability  of  detecting 
an  error  throughout  tj.  We  noticed  no  real  effect  of  this  problem. 

An  excellent  feature  of  the  Poisson  models  is  that  the  (Poisson)  joint  probability 
density  function  of  the  data  (the  data  are  the  k pairs  (Nj,  Tjj)  depends  not  only  on  the 
observed  number  of  errors  found  in  each  interval  Nj  but  also  the  cumulative  (to  the 
ith  interval)  number  of  errors  removed  Mj.j.  Thus,  if  errors  are  found  but  not 
fixed  immediately  the  Poisson  models  can  accommodate  this  situation. 
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RECOM MENDATIONS  FOR  F URTHER  STUDY  AND  CONCLUSIONS 

(I 

I 


The  models  did  not  look  good  in  this  study  - poor  fits  and  lack  of  convergence  of 
the  numerical  method.  (See  Section  0.0  for  a summary  of  the  major  results  of  this 
study  and  Table  7.2  for  a rank-ordering  of  models  according  to  fit. ) Actually  little  of 
this  was  due  to  poor  assumptions  inherent  (internal)  in  the  models.  It  is  true  that  the 
models  do  not  "permit"  the  introduction  of  errors  in  the  S/W  development  process. 
However,  we  feel  that  the  failure  of  the  data  to  show  plausible  (decreasing  error  rate, 
on  the  average,  in  increasing  time)  behavior  was  not  due  to  the  introduction  of  errors 
during  the  debugging  process  nor  to  random  fluctuations  but  to  the  inconsistency  of  the 
debugging  process.  As  "proof"  of  this  we  note  that  when  one  of  the  models  was  pre- 
sumed and  a simulation  run,  the  model  (from  which  the  data  was  generated)  fit  well. 
Ordinarily  this  is  not  enough  evidence  on  which  to  blame  the  data.  However,  the 
failure  of  the  parameter  estimates  to  converge  in  so  many  cases  indicates  more  was 
going  on  than  just  that  the  models  were  ill-chosen. 

Aside  from  the  problems  of  poor  fits  and  lack  of  convergence  of  the  numerical 
methods,  the  models  (summarized  in  Table  7.1)  are  certainly  not  mathematically 
unmanageable  nor  are  they  totally  lacking  in  physical  interpretation  in  terms  of 
software  development  phenomena.  Indeed,  most  models  contain  the  parameter  N, 
the  total  number  of  errors  present  in  the  software  initially,  about  which  it  is  im- 
portant to  make  inferences.  Also,  most  of  the  models  were  found  to  be  useful,  at 
least  in  theory',  in  predicting  the  time  necessary  to  complete  the  debugging  proc- 
ess. And  finally,  the  parameter  estimators  were  found  to  be  asymptotically  nor- 
mally distributed  which,  in  theory,  allows  interval  estimation  of  the  unknown 
parameters  in  large  samples  using  known  techniques  related  to  the  normal 
distribution. 

Each  of  the  models  studied  in  this  report  could  reasonably  serve  as  an  idealized 
model  for  large-scale  software  debugging  processes.  This,  of  course,  was  the 
Intention  of  the  respective  "inventors"  of  the  models  studied  herein.  It  is  obvious, 
though,  that  no  "idealized"  data  from  large-scale  software  programs  exists  and  this 

• C • « • | . . , » . • . . » • . . . 
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problem,  until  now,  has  not  been  recognized.  It  is  conceivable  that  one  or  more  of 
these  models  could  describe  the  actual  debugging  processes  on  large-scale  software 
projects  very  well,  or  conversely,  it  may  be  possible  to  revise  management  practices 
utilized  on  software  projects  so  that  one  or  more  of  the  models  will  describe  the 
process.  At  any  rate,  we  believe  that  it  can  be  concluded  from  this  study  that  the 
state  of  the  art  in  software  data  collection  practices  is  entirely  incompatible  with  the 
sophistication  of  the  models  discussed  in  this  report. 

Thus  we  feel  there  are  two  areas  requiring  further  study.  First,  a carefully 
controlled  S/W  program  shall  be  designed  and  the  debugging  process  carefully  moni- 
tored and  error  events  faithfully  recorded.  Then,  we  believe  several  of  the  models 
will  be  seen  to  fit.  As  a part  of  this  (controlled  computer  program  development)  study 
effort  standards /methods  of  S/W  error  data  collection  should  be  developed,  the  ulti- 
mate aim  being  a MILITARY  STANDARD  and/or  HANDBOOK  and/or  GUIDELINES  for 
UNIFORM  industry  S/W  error  data  collection  and  documentation  procedures. 
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Equations  Determining  MLEs 


Equations  Determining  LSEs 


(quations  Determining  LSEs 


i 


/ -aTi\ 

l)  = 0 


<*  * 


0 


* • 


Model 

Mean  Values 

Joint  Probability  Function 

1 

j 

4 

Generalized 

Poisson 

0(N  - M._1)T.a' 

a N -#N-M  >T,“ 

k |„(N  - | e 

1 1 Ni! 

i=l 

Model  (GPM) 

, 

(Sections  3.1, 

' 

4. 1.3) 

i 

i 

< 

Jelinski 

Moranda 
(Sections  3.1, 

4. 1.3) 

0(N  - 

Same  as  GPM  with  a = 1 

< 

- 

4 

Schick- 
Wolverton 
(Sections  3.1, 

4. 1.3) 

0 

1 

H* 

1 

h- * 

Same  as  GPM  with  ot  = 2 

... 

TABLE  7.1.  SUMMARY  Or  MODELS-Conttaued  *• 

< 

* ^ 

. 
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TABLE  7.  2.  RANK  ORDERING  OF  MODELS  ACCORDING  TO  FIT 


Data  Set  1 


Model 

1-fd 

1-fdG 

1-fx 

1-fxG 

GPM  (LSE) 

1 

J-M  (LSE) 

2 

4 

S-W  (LSE) 

GPM  (MLE) 

3 

1 

J-M  (MLE) 

S-W  (MLE) 

Binomial  (LSE) 

3 

I 

Binomial  (LSE) 

1 

II 

IBM  (LSE) 

1 

2 

2 

5 

IBM  (MLE) 

1 

1 

2 

Data  Set  2 


Model 

2-fd 

1 

2-fdG 

1 

2-fx 

2-fxG 

GPM  (LSE) 

1 

J-M  (LSE) 

7 

6 

S-W  (LSE) 

GPM  (MLE) 

4 

1 

J-M  (MLE) 

1 

2 

S-W  (MLE) 

Binomial  (LSE) 

5 

4 

I 

Binomial  (I£E) 

l 

6 

5 

n 

IBM  (LSE) 

3 

7 

IBM  (MLE) 

2 

3 

Note:  Models  are  ranked  from  best  fit  (1)  to  worst  fit  (possible  10).  Blank  entries 
indicate  that  no  fit  was  obtained. 


TABLE  7.2  RANK  ORDERING  OF  MODELS  ACCORDING  TO  FIT  Continued 


Data  Set  4 


Model 

4-fd 

4-fdG 

4-fx 

4-f.\G 

I 

4-ff 

4-ffG 

GPM  (LSE) 

1 

i 

1 

J-M  (LSE) 

6 

6 

6 

4 

4 

3 

S-W  (LSE) 

1 

I 

GPM  (MLE) 

l 

1 

1 

2 

1 

J-M  (MLE) 

2 

3 

3 

2 

S-W  (MLE) 

Binomial  (LSE) 

3 

4 

3 

2 

! 

I 

Binomial  (LSE) 

5 

5 

5 

3 

11 

IBM  (LSE) 

4 

7 

4 

5 

IBM  (MLE) 

1 

2 

2 

1 

1 ! 

Note:  Models  are  ranked  from  best  fit  (1)  to  worst  fit  (possible  10).  Blank  entries 
indicate  that  no  fit  was  obtained. 


Data  Set  6 


— 

Model 

6-fd 

6-fdG 

6-fx 

6-fxG 

6— ff 

6-£fG 

GPM  (LSE) 

J-M  (LSE) 

4 

5 

S-W  (LSE) 

GPM  (MLE) 

J-M  (MLE) 

6 

. 

7 

S-W  (MLE) 

5 

6 

Binomial  (LSE) 

3 

3 

I 

Binomial  iLSEl 

4 

4 

II 

IBM  (LSE) 

2 

2 

IBM  (MLE) 

1 

1 





Note:  Models  are  ranked  from  best  fit  (I)  to  worst  fit  (possible  10).  Blank  entries 
indicate  that  no  fit  was  obtained. 


GPM  (LSE) 
J-M  (LSE) 

S-VV  (LSE) 

GPM  (MLE) 
J-M  (MLE) 

S-W  (MLE) 
Binomial  (LSE) 

I 

Binomial  (LSE) 

II 

IBM  (LSE) 

IBM  (MLE) 


1 


Note:  Models  are  ranked  from  best  fit  (1)  to  worst  fit  (possible  10).  Blank  entries 
indicate  that  no  fit  was  obtained. 


TABLE  7.2.  RANK  ORDERING  OF  MODELS  ACCORDING  TO  FIT  Continued 


Data  Set  9 


GPM  (LSE) 

J-M  (LSE) 

2 

S-W  (LSE) 

2 

GPM  (MLE) 

1 

J-M  (MLE) 

* 

S-W  (MLE) 

1 

Binomial  (LSE) 

2 

I 

Binomial  (LSE) 

2 

II 

IBM  (LSE) 

4 

IBM  (MLE) 

3 

Note:  Models  are  ranked  from  best  fit  (1)  to  worst  fit  (possible  10).  Blank  entries 
indicate  that  no  fit  was  obtained. 
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GLOSSARY 


LSE 

LS 

MLE 

ML 

iid 

S/W 

SPR 

PTR 


J-M 

S-W 

Var  (.) 

E (•) 

Cov 

GPM 

N 

N. 

1 

k 

a 

N (0,  1) 


least  squares  estimate 
least  squares 

maximum  likelihood  estimate 

maximum  likelihood 

independent  and  identically  distributed 

software 

Software  Problem  Report 

Program  Trouble  Report  (Hughes  version  of  SPR) 

MLE  of  parameter  • 

LSE  of  parameter  • 

Jelinski-Moranda 

Schick-Wolverton 

Variance  of  . 

Expectation  of  • 

Covariance  of  • and  •• 

Generalized  Poisson  Model 

Total  number  of  errors  initially  present  in  S/W 

The  number  errors  observed  in  the  ith  debugging-time 
interval  (not  to  be  confused  with  N). 

number  of  debugging-time  intervals 

asymptotically  equal  to  (in  probability  when  random  quantities 
are  compared) 

Normally  distributed  with  mean  0 and  variance  1. 
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MISSION 

of 

Rome  Air  Development  Center 

MDC  plan*  and  executes  research,  development,  test  and 
detected  acquisition  programs  In  support  of  Command,  Control 
Communication*  and  Intelligence.  (C3I J activities.  Technical 
and  engineering  support  within  areas  of  technical  competence 
is  provided  to  ESD  Program  0 ibices  (POs)  and  other  ESP 
element a.  The  principal  technical  mission  areas  are 
communications,  electromagnetic  guidance  and  control,  sur- 
veillance of  ground  and  aerospace  objects,  intelligence  data 
collection  and  handling,  information  system  technology, 
ionospheric  propagation,  solid  state  sciences,  microwave 
physics  and  electronic  reliability,  maintainability  and 
compatibility. 


